STREPTOCOCCUS PNEUMONIAE DNA SEQUENCES 

This invention provides DNA sequences from the 
Streptococcus pneumoniae genome, and methods of use of DNA 
fragments originating therefrom in a variety of biological 
and pharmaceutical applications. 

The recent emergence of widespread antibiotic 
resistance in common pathogenic bacterial species has 
justifiably alarmed the medical and research communities. 
Frequently these organisms are co-resistant to several 
different antibacterial agents. Particularly problematic has 
been the emergence and rapid spread of penicillin resistance 
in Streptococcus pneumoniae, which frequently causes upper 
respiratory tract infections. Resistance to penicillin in 
this organism can be due to modifications of one or more of 
the penicillin-binding proteins (PBPs). Combating the 
phenomenon of increasing resistance to antibiotic agents 
among pathogenic organisms such as Streptococcus pneumoniae 
will require intensified research into the fundamental 
molecular biology of such organisms. Greater knowledge about 
the molecular biology of pathogenic organisms will lead to 
new antibacterial agents having novel and effective actions. 

While inroads in the development of new antibiotics and 
new targets for antibiotic compounds have been made with a 
variety of microorganisms, progress has been less apparent 
in Streptococcus pneumoniae. In part, Streptococcus 
pneumoniae presents a special case because this organism is 
highly recombinogenic and readily takes up exogenous DNA 
from its surroundings. Thus, the need for new antibacterial 
compounds and new targets for antibacterial therapy in 
Streptococcus pneumoniae is more acute than in other 
organisms. 



WO 98/26072 PCT/US97/22578 






The present invention relates to the genome of S. 
pneumoniae. The genomic information disclosed by the present 
invention enables: (1) preparation of molecular 
hybridization probes for use in PCR amplification of genes 
and regulatory regions, physical mapping, sequencing, 
mutagenesis, and mutation analysis, (2) homology comparisons 
with the genomes and open reading frames (ORFs) of other 
organisms, (3) creation of specifically mutated strains of 
S. pneumoniae wherein the mutation is targeted to any site 
or sites in the DNA sequence disclosed herein, (4) 
identification of S. pneumoniae promoters and other gene 
regulatory sequences, (5) identification of proteins/ORFs 
encoded by S. pneumoniae, (6) identification of virulence 
genes in S. pneumoniae, (7) determination of the biological 
function of proteins/ORFs and RNAs encoded by S. pneumoniae, 
(8) production of kits useful for determining gene function 
in the cell, and kits for isolating and analyzing genes that 
are mutated in antibiotic resistant clinical isolates of S. 
pneumoniae, (9) production of proteins and RNAs encoded by 
S. pneumoniae, (10) production of antibodies against 
proteins and other antigens encoded by S. pneumoniae, and (11) 
methods to identify compounds that bind to proteins and RNAs 
encoded by S. pneumoniae as potential new antibiotic 
compounds. 

In another embodiment the invention relates to 
substantially purified proteins encoded by the S. pneumoniae 
genome. 

Table 1 summarizes the proteins and nucleic acids 
disclosed herein, contigs, SEQ ID NOs, and predicted 
functions. 

"Genome" refers to the full complement of chromosomal 
and extra-chromosomal DNA within a cell. The genome 
comprises the genetic blueprint for all proteins and RNAs 
encoded by the cell or organism. 

"ORF" (i.e. "open reading frame") designates a region 
of genomic DNA beginning with a Met or other initiation 
codon and terminating with a translation stop codon, 
potentially encoding a protein product. "Partial ORF" means 
a portion of an ORF as disclosed herein such that the 
initiation codon, the stop codon, or both are not disclosed. 
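
By way of illustration only, the ORF definition above can be sketched in Python as follows. The function name find_orfs, the min_codons parameter, and the choice of GTG and TTG as "other initiation codons" are assumptions made for this sketch and are not taken from the disclosure. Only the forward strand is scanned, and partial ORFs are not reported.

```python
# Illustrative sketch only; names and defaults are hypothetical.
START_CODONS = {"ATG", "GTG", "TTG"}  # Met plus assumed alternative starts
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=30):
    """Return (start, end, frame) tuples for forward-strand ORFs.

    start is the 0-based index of the initiation codon and end is the
    index just past the stop codon. The codon count includes the stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon in START_CODONS:
                start = i  # first initiation codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # resume the search after the stop codon
    return orfs
```

A real analysis would also scan the reverse complement and would typically rely on a dedicated gene-finding program rather than this minimal scan.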

"DNA chip" or "Bio Chip" or "Bio DNA chip" refers to a 
solid matrix or support onto which is applied an array of 
oligonucleotides, or nucleotide sequences, or gene 
fragments, or genomic fragments, of S. pneumoniae which may 
further comprise a layer of S. pneumoniae cells suspended 
thereover in a semisolid medium such as agar or agarose. 

"Consensus sequence" refers to an amino acid or 
nucleotide sequence that may suggest the biological function 
of a protein, DNA, or RNA molecule. Consensus sequences are 
identified by comparing proteins, RNAs, and gene homologs 
from different species. 

"Contiguous fragment building" or "Contiguous fragment" 
or "Contig" refers to the process and result, respectively, 
by which a fragment of DNA is assembled from smaller 
constituent DNA fragments by arranging the constituent 
pieces in their correct order and register such that the 
resulting contiguous fragment accurately depicts the native 
DNA sequence from which the smaller fragments originated. 

"Computer readable medium" includes, for example, a 
floppy disc, hard disc, random access memory, read only 
memory, and CD-ROM. 

The terms "cleavage" or "restriction" of DNA 
refer to the catalytic cleavage of the DNA with a 
restriction enzyme that acts only at certain sequences in 
the DNA (viz. sequence-specific endonucleases). The various 
restriction enzymes used herein are commercially available 
and their reaction conditions, cofactors, and other 
requirements are applied in the manner well known to one of 
ordinary skill in the art. Appropriate buffers and substrate 
amounts for particular restriction enzymes are specified by 
the manufacturer or can be found in the literature. 

"Diagnostics" as used herein relates to in vitro 
or in vivo diagnosis of disease states or biological status 
in mammals, preferably humans. 

"Therapeutics" and "therapeutic/diagnostic 
combinations" mean the treatment, or diagnosis and 
treatment, of disease states or biological status by in vivo 
administration to mammals, preferably humans, of 
compositions of the present invention, for example, 
antibodies. 

"Essential genes" or "essential ORFs" or 
"essential proteins" refer to genomic information or the 
protein(s) or RNAs encoded therefrom, which, when disrupted 
by knockout mutation, or by other mutation, produce 
inviability in cells harboring said mutation. 

"Non-essential genes" or "non-essential ORFs" or 
"non-essential proteins" refer to genomic information or the 
protein(s) or RNAs encoded therefrom, which, when disrupted 
by knockout mutation, or by other mutation, do not result in 
inviability of cells harboring said mutation. 

"Minimal gene set" refers to a genus of about 256 
genes that are conserved among different bacteria such as M. 
genitalium and H. influenzae. The minimal gene set appears 
to be necessary and sufficient to sustain life. See e.g. A. 
Mushegian and E. Koonin, "A minimal gene set for cellular 
life derived by comparison of complete bacterial genomes" 
Proc. Nat. Acad. Sci. 93, 10268-10273 (1996). 

The term "fragment thereof" denotes a fragment of 
a nucleic acid molecule described herein, wherein said 
fragment comprises a region of contiguity within said 
nucleic acid of at least 15 base pairs. The term may also 
refer to a peptide of at least 5 contiguous amino acid 
residues of a protein disclosed herein. 

The term "plasmid" refers to an extrachromosomal 
genetic element. The starting plasmids herein are either 
commercially available, publicly available on an 
unrestricted basis, or can be constructed from available 
plasmids in accordance with published procedures. In 
addition, equivalent plasmids to those described are known 
in the art and will be apparent to the ordinarily skilled 
artisan. 

"Recombinant DNA cloning vector" as used herein 
refers to any autonomously replicating agent, including, but 
not limited to, plasmids and phages, comprising a DNA 
molecule to which one or more additional DNA segments can or 
have been added. 

The term "recombinant DNA expression vector" as 
used herein refers to any recombinant DNA cloning vector, 
for example a plasmid or phage, in which a promoter and 
other regulatory elements are present to enable 
transcription of the inserted DNA. 

The term "vector" as used herein refers to a 
nucleic acid compound used for introducing exogenous DNA 
into host cells. A vector comprises a nucleotide sequence 
which may encode one or more protein molecules. Plasmids, 
cosmids, viruses, and bacteriophages, in the natural state 
or which have undergone recombinant engineering, are 
examples of commonly used vectors. 

The terms "complementary" or "complementarity" as 
used herein refer to the capacity of purine and pyrimidine 
nucleotides to associate through hydrogen bonding in 
double-stranded nucleic acid molecules. The following base pairs 
are complementary: guanine and cytosine; adenine and 
thymine; and adenine and uracil. 
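
As a non-limiting illustration, the base-pairing rules just listed can be expressed in a few lines of Python. The names complement_strand and is_complementary are hypothetical and serve only this sketch. The strand is reversed because paired strands run antiparallel.

```python
# Illustrative sketch only; function names are hypothetical.
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def complement_strand(seq, rna=False):
    """Return the antiparallel complementary strand, written 5' to 3'."""
    pairs = RNA_PAIRS if rna else DNA_PAIRS
    return "".join(pairs[base] for base in reversed(seq.upper()))


def is_complementary(strand_a, strand_b):
    """True if DNA strand_b pairs with strand_a at every position."""
    return complement_strand(strand_a) == strand_b.upper()
```

For example, complement_strand("ATGC") gives "GCAT", and the same call with rna=True on "AUGC" gives "GCAU".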

"Oligonucleotide" refers to a short polymeric 
nucleotide chain comprising from about 2 to 25 nucleotides. 

"Isolated nucleic acid compound" refers to any RNA 
or DNA sequence, however constructed or synthesized, which 
is locationally distinct from its natural location. 

A "primer" is a nucleic acid fragment which 
functions as an initiating substrate for enzymatic or 
synthetic elongation of a nucleic acid molecule. 

The term "promoter" refers to a DNA sequence which 
directs transcription of DNA to RNA. 

A "probe" as used herein is a labeled nucleic acid 
compound which can be used to hybridize with another nucleic 
acid compound. 

The terms "hybridization" or "hybridize" as used 
herein refer to the process by which a single-stranded 
nucleic acid molecule joins with a complementary strand 
through nucleotide base pairing. 

"Recorded" as used herein refers to a process for 
storing information on a computer readable medium. 

"Substantially identical" means a sequence having 
sufficient homology to hybridize under high stringency 
conditions and/or at least 90% identity at the nucleotide or 
amino acid sequence level to a sequence disclosed herein. 
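
The identity criterion in this definition can be illustrated with a short Python sketch. It assumes that the two sequences have already been aligned to equal length without gaps, which the definition does not state, and the function names and the threshold parameter are hypothetical. The alternative criterion of hybridization under high stringency is not modeled.

```python
# Illustrative sketch only; assumes pre-aligned, ungapped sequences.
def percent_identity(seq_a, seq_b):
    """Percentage of positions at which two equal-length sequences match."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same non-zero length")
    matches = sum(1 for x, y in zip(seq_a.upper(), seq_b.upper()) if x == y)
    return 100.0 * matches / len(seq_a)


def is_substantially_identical(seq_a, seq_b, threshold=90.0):
    """Apply the 'at least 90% identity' test from the definition."""
    return percent_identity(seq_a, seq_b) >= threshold
```

In practice an alignment program would first place the sequences in register before identity is counted.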

"Substantially purified" when used in reference to 
a protein or peptide means that the molecule has been 
largely, but not necessarily wholly, separated and purified 
from other cellular and non-cellular components. Typically a 
protein is substantially pure when it is at least about 60% 
by weight free from other naturally occurring organic 
molecules. Preferably the purity is at least about 75%, more 
preferably at least about 90%, and most preferably at least 
about 99% by weight pure. 

"Selective hybridization" refers to hybridization 
under conditions of high stringency. Hybridization of 
nucleic acid molecules depends upon factors such as the 
degree of complementarity, stringency of hybridization 
conditions, and the length of hybridizing strands. 

The term "stringency" relates to nucleic acid 
hybridization conditions. High stringency conditions 
disfavor non-homologous base pairing. Low stringency 
conditions have the opposite effect. Stringency may be 
altered, for example, by changes in temperature and salt 
concentration. Typical high stringency conditions comprise 
hybridizing at 50°C to 65°C in 5X SSPE and 50% formamide, 
and washing at 50°C to 65°C in 0.5X SSPE; typical low 
stringency conditions comprise hybridizing at 35°C to 37°C 
in 5X SSPE and 40% to 45% formamide and washing at 42°C in 
1X to 2X SSPE. 

"SSPE" denotes a hybridization and wash solution 
comprising sodium chloride, sodium phosphate, and EDTA, at 
pH 7.4. A 20X solution of SSPE is made by dissolving 174 g 
of NaCl, 27.6 g of NaH2PO4·H2O, and 7.4 g of EDTA in 800 ml 
of H2O. The pH is adjusted with NaOH and the volume brought 
to 1 liter. 

"SSC" denotes a hybridization and wash solution 
comprising sodium chloride and sodium citrate at pH 7. A 20X 
solution of SSC is made by dissolving 175 g of NaCl and 88 g 
of sodium citrate in 800 ml of H2O. The volume is brought to 
1 liter after adjusting the pH with 10N NaOH. 
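
The arithmetic behind the two 20X recipes above, and the dilutions used in the stringency conditions, can be checked with a brief Python sketch. The salt forms assumed for EDTA (disodium salt dihydrate) and sodium citrate (trisodium salt dihydrate) are not stated in the text and are assumptions of this sketch, as are all names in the code.

```python
# Illustrative sketch only. Formula weights are in g/mol.
FORMULA_WEIGHT = {
    "NaCl": 58.44,
    "NaH2PO4.H2O": 137.99,
    "Na2EDTA.2H2O": 372.24,     # assumed salt form of "EDTA"
    "Na3citrate.2H2O": 294.10,  # assumed salt form of "sodium citrate"
}


def molarity(grams, compound, liters=1.0):
    """Molar concentration of `grams` of compound in `liters` of solution."""
    return grams / FORMULA_WEIGHT[compound] / liters


def stock_volume_ml(target_x, final_volume_ml, stock_x=20.0):
    """Volume of stock needed for a dilution (C1 * V1 = C2 * V2)."""
    return target_x * final_volume_ml / stock_x


# Grams per liter of 20X stock, taken from the recipes above.
SSPE_20X = {"NaCl": 174, "NaH2PO4.H2O": 27.6, "Na2EDTA.2H2O": 7.4}
SSC_20X = {"NaCl": 175, "Na3citrate.2H2O": 88}

for name, recipe in (("20X SSPE", SSPE_20X), ("20X SSC", SSC_20X)):
    for compound, grams in recipe.items():
        print(f"{name}: {compound} = {molarity(grams, compound):.3f} M")
```

Under these assumptions both stocks come out at roughly 3 M NaCl, and 25 ml of 20X stock brought to 100 ml gives the 5X solution used for hybridization.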

"Virulence gene" as used herein means a gene from 
a pathogenic organism such as S. pneumoniae that is required 
for infection and/or pathogenicity in vivo. Some virulence 
genes are induced during infection of a host; others are 
expressed exclusively during in vivo infection. 

The Streptococcus pneumoniae genome contains about 2.2 
million nucleotide base pairs and comprises about 2000 to 
3000 ORFs and other genes. This invention provides, among 
other things, contiguous fragments, genes, and proteins from 
the S. pneumoniae genome (SEQ ID NO:1 through SEQ ID 
NO:228). 

Strain differences in S. pneumoniae may be associated 
with nucleotide sequence differences in one or more of the 
genomic fragments disclosed herein. Sequences that are 
substantially identical to the sequences disclosed herein 
are intended to be within the scope of the invention. 

The sequence fragments disclosed herein provide a wide 
variety of utilities. For example, the fragments may be used 
to identify regions of the S. pneumoniae genome that are 
expressed as proteins (viz. transcribed into mRNA). The 
genomic fragments disclosed herein can also be used to 
examine differential expression of S. pneumoniae genes under 
diverse environmental conditions, as occurs, for example, 
with the expression of virulence genes during in vivo 
infection of a host organism. Also contemplated by the 
invention are: (1) preparation of molecular hybridization 
probes for use in physical mapping, sequencing, mutagenesis, 
and mutation analysis, (2) homology comparisons of the sequences 
disclosed herein with the genomes and ORFs of other 
organisms, (3) creation of specifically mutated strains of 
S. pneumoniae wherein the mutation is targeted to any site 
in the DNA sequence disclosed herein, (4) identification of 
S. pneumoniae promoters and other gene regulatory sequences, 
(5) identification of proteins and RNAs encoded by S. 
pneumoniae, (6) amplification of S. pneumoniae genes using 
the PCR, and (7) production of kits for isolating and 
analyzing genes that are mutated in antibiotic resistant 
clinical isolates of S. pneumoniae. 

Genome Analysis 

In one embodiment, the invention comprises the ORFs and 
fragments thereof encoded by the nucleotide sequences 
disclosed herein. Some of the nucleotide sequences disclosed 
herein encode ORFs and fragments of ORFs (Table 1). The ORFs 
or fragments thereof were identified by translation of the 
nucleic acid sequences disclosed herein. The biological 
function of a protein disclosed in Table 1 was determined by 
homology comparison with known proteins from other 
organisms. A number of computer programs are available to 
assist in homology comparisons, for example GeneMark 
(Borodovsky and McIninch, Computers Chem. 17(2), 123, 1993). 
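
Translation of a nucleotide sequence, as referred to above, can be illustrated by the following Python sketch. It uses the standard genetic code and translates the three forward reading frames only. The function names are hypothetical, and the sketch is not the procedure actually used to prepare Table 1.

```python
# Illustrative sketch only; standard genetic code, forward frames only.
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Codons are generated in TCAG order so that they line up with AMINO_ACIDS.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq, frame=0):
    """Translate one reading frame; '*' marks a stop, 'X' an unknown codon."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")
        for i in range(frame, len(seq) - 2, 3)
    )


def forward_frames(seq):
    """Return the translations of the three forward reading frames."""
    return [translate(seq, frame) for frame in range(3)]
```

The resulting amino acid strings are the kind of query that would then be compared against known proteins from other organisms.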

Computer-Related Applications 

The nucleotide and/or amino acid sequence information 
of this invention may be provided in a variety of media to 
facilitate use. In one embodiment the present invention 
comprises one or more of the sequences disclosed herein 
recorded on a computer readable medium. A variety of media 
are contemplated, for example, magnetic storage media such 
as floppy discs, hard disc storage, and magnetic tape, and 
optical media such as CD-ROM. A skilled artisan can readily 
adopt any presently known method for recording information 
on a computer readable medium to generate manufactures 
comprising the nucleotide or amino acid sequence information 
of the present invention. 
These embodiments are contemplated within the scope of this 
invention. 

The choice of a data storage structure will generally 
be based on the means chosen to access the stored 
information. A variety of data processor programs and 
formats can be used to store the sequence information of the 
invention on a computer readable medium. For example, the 
sequence can be represented in a word processing text file 
that is formatted in commercially available software such as 
WordPerfect and Microsoft Word, or it can be represented in 
the form of a text-only file such as an ASCII file. 
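
One possible text-only representation is sketched below in Python. The FASTA-style layout, the record name, and the function names are assumptions chosen for illustration; the disclosure does not prescribe any particular file format, and the sequence shown is a made-up placeholder rather than a sequence of the invention.

```python
# Illustrative sketch only; format choice and names are assumptions.
def to_fasta(records, width=60):
    """Serialize (name, sequence) pairs as FASTA-style ASCII text."""
    lines = []
    for name, seq in records:
        lines.append(">" + name)
        # Wrap the sequence so that no line exceeds `width` characters.
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


def parse_fasta(text):
    """Read FASTA-style text back into a list of (name, sequence) pairs."""
    records, name, chunks = [], None, []
    for line in text.splitlines():
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif line.strip():
            chunks.append(line.strip())
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


example = [("placeholder_fragment", "ATGC" * 40)]  # hypothetical record
text = to_fasta(example)
```

The resulting string can be written to any computer readable medium, for instance with open(path, "w", encoding="ascii").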

Having S. pneumoniae genomic sequence information in a 
computer readable format enables a skilled artisan to access 
the information for a variety of purposes. For example, 
computer-assisted searching algorithms may be used to 
identify open reading frames, and ascertain biological 
function based on homology to known proteins from other 
organisms. Suitable algorithms for sequence comparisons 
include BLAST (Altschul et al., J. Mol. Biol. 215, 403-410, 
1990) and BLAZE (Brutlag et al., Comp. Chem. 17, 203-207, 
1993). For identification of ORFs a number of commercially 
available software programs are suitable, such as FRAMES 
(Genetics Computer Group, Madison, WI). 

The genomic information of this invention in computer- 
readable form can be manipulated further using 
bioinformatics to identify the biological function of 
proteins encoded by ORFs as well as the cellular location of 
said proteins. The skilled artisan will recognize several 
computer-assisted algorithms for this purpose, for example, 
PSORT which is useful for determining the likely location of 
a protein within a cell (See K. Nakai & M. Kanehisa, "Expert 
system for predicting protein localization sites in 
Gram-negative bacteria", Proteins: Structure, Function, and 
Genetics, 11, 95-110 (1991)). 

Open Reading Frames and Proteins 

The invention also provides proteins encoded by the S. 
pneumoniae genome in substantially purified form (See Table 
1). The proteins are classified herein as (1) Hypothetical, 
(2) Cell wall biosynthetic, (3) External target, or (4) 
Minimal gene set proteins. 

Cells that carry knockout mutations in genes encoding 
proteins of the hypothetical class are nonviable. Loss of 
viability suggests that these proteins may be essential. Two such 
proteins, whose genes map to contigs m014 and m016, 
correspond respectively to Haemophilus influenzae ORFs 
HI1146 and HI1648. Two other hypothetical proteins, yyaF and 
ywbL, correspond to a GTP binding protein and 
transcriptional regulator, respectively. 

The proteins of this invention can be used to raise 
antibodies. Antibodies against the hypothetical class of 
proteins are especially attractive. In targeting 
presumptively essential cellular functions, antibodies 
against "hypothetical proteins" could have therapeutic or 
prophylactic applications. Additionally, the "hypothetical" 
proteins can be used to screen for agents that bind or 
otherwise interact with said proteins. Such agents could 
lead to the identification of new antibacterial agents. 

Proteins classified in Table 1 as cell wall 
biosynthetic proteins, and external target proteins, were 
identified by homology with known proteins. These proteins 
are useful for identifying agents that bind and inhibit 
bacterial growth. Therefore, in another embodiment of the 
invention, the proteins of these classifications are 
prepared, preferably by recombinant means as described 
herein, substantially purified, and used in a screen to 
identify compounds that bind and/or inhibit the activity of 
said proteins. A variety of suitable screens are 
contemplated for this purpose. For example, the protein(s) 
can be labeled by known techniques such as radiolabeling or 
fluorescent tagging, or by labeling with biotin/avidin; 
thereafter binding of a test compound to a labeled protein 
can be determined by any suitable means well known to the 
skilled artisan. 

The proteins categorized as "minimal gene set" are 
homologous to a set of highly conserved proteins found in 
other bacteria. The minimal gene set proteins are thought to 
be essential for viability, and are useful targets for the 
development of new antibacterial compounds. 

DNA Chips and Applications 

The nucleic acids disclosed herein, or subfragments 
thereof, may be arrayed on any suitable solid surface, 
thereby constructing a "chip." DNA chip hybridizations 
provide greater sensitivity than do conventional 
hybridization means, such as Southern hybridization or 
Northern hybridization. DNA chips are useful for a variety 
of purposes, for example, in mutation and gene expression 
analysis, and in probing the structure, function, and 
expression of the genome. This aspect of the invention 
relates to any one or more of the DNA fragments disclosed 
herein, wherein said fragments are attached to a solid 
support (i.e. "chip" or "DNA chip" or "Bio chip"). Attachment 
of a nucleic acid to a support can be, but is not 
necessarily, accomplished by chemical or enzymatic means. 

In one embodiment, DNA fragments of this invention are 
arrayed onto a solid support as a means for assessing gene 
expression in S. pneumoniae. The DNA fragments attached to a 
chip may be of any size that is suitable for hybridization 
to other nucleic acid molecules such as cDNAs, genomic DNAs, 
or RNAs. Suitably-sized DNA fragments are from 10 nucleotide 
residues to approximately several thousand residues. The 
preferred length is about 50 to 500 nucleotides. 
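
As a hedged illustration of how fragments in the preferred size range might be derived from a longer contig, the following Python sketch cuts a sequence into overlapping pieces. Overlapping tiling, the default length of 200 nucleotides, the step of 100 nucleotides, and the function name are all assumptions of this sketch; the disclosure specifies only the acceptable fragment sizes.

```python
# Illustrative sketch only; tiling strategy and defaults are assumptions.
def tile_fragments(seq, length=200, step=100, min_length=10):
    """Cut seq into overlapping fragments as (start, fragment) pairs.

    Fragments shorter than min_length (10 nucleotides, the lower bound
    given in the text) are discarded.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    fragments = []
    for start in range(0, len(seq), step):
        fragment = seq[start:start + length]
        if len(fragment) >= min_length:
            fragments.append((start, fragment))
        if start + length >= len(seq):
            break  # the end of the contig has been reached
    return fragments
```

For a 450-nucleotide contig the defaults give four fragments starting at positions 0, 100, 200, and 300, the last being 150 nucleotides long.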

Gene expression is assessed using the chips of this 
invention by hybridizing a chip to RNA samples, or to cDNA 
samples, prepared from S. pneumoniae grown 
under any suitable conditions. Preferred samples for 
hybridization to a chip comprise cDNA. Methods for preparing 
RNA or cDNA are well known in the art. 

A variety of suitable methods are known for fixing DNA 
fragments to solid support matrices [See e.g. D. Stimpson et 
al. "Real-time detection of DNA hybridization and melting on 
oligonucleotide arrays by using optical wave guides" Proc. 
Nat. Acad. Sci. 92, 6379 (1995)]. Preferred surfaces for 
producing a chip are glass or polystyrene. Convenient 
surfaces are microscope slides, or cover slips (Corning), 
treated with silicon or silane to minimize non-specific 
binding by DNA or proteins. Also suitable for this purpose 
are 96-well microtiter plates. 

A light-directed method may be used for attaching 
oligonucleotides, enabling nucleotide synthesis directly on 
the solid surface using photolabile 5'-protected 
N-acyl-deoxynucleotide phosphoramidites and surface linker 
chemistry (See Pease et al. "Light-generated oligonucleotide 
arrays for rapid DNA sequence analysis" Proc. Nat. Acad. 
Sci. 91, 5022-5026, 1994). Alternatively, DNA fragments can 
be bound to a surface via interaction with a specific DNA 
binding protein. Any suitable DNA binding protein may be 
used, for example bacteriophage DNA binding proteins, 
Adenovirus binding protein, the E. coli lac-repressor 
protein, or λ-repressor protein. DNA binding proteins are 
attached to the surface of a chip by covalent chemical 
binding, essentially as described in U.S. Patent 5,561,071, 
the entire contents of which are incorporated by reference. 
The latter method requires that DNA fragments contain a 
recognition sequence that enables binding by the DNA binding 
protein. Specific sequences for a number of DNA binding 
proteins are known. Methods for incorporating specific 
binding sequences into the genomic DNA fragments disclosed 
herein are well known in the cloning arts. 

DNA chip technology enables monitoring S. pneumoniae 
gene expression on a genome-wide level. This feature of the 
invention is particularly attractive for identifying (1) 
genes that are expressed or not expressed during the life 
cycle or infection cycle of S. pneumoniae, and (2) changes 
in gene expression that correlate with environmental change. 

For example, virulence genes in S. pneumoniae can be 
identified by the DNA chip method disclosed herein. 
Identification of virulence genes in S. pneumoniae will 
provide new targets for developing novel antibiotics. For 
this aspect of the invention any suitable encapsulated 
strain of S. pneumoniae is introduced into a mouse, for 
example, by intraperitoneal injection, or by introduction 
directly into the lungs, or by any other suitable method. 
Approximately 2 days after infection a peripheral blood titre 
level is reached of about 10^8 S. pneumoniae cells/ml. Cells 
recovered from peripheral blood, or other suitable tissue, 
are used in identifying virulence genes. For this purpose, 
cDNAs are prepared from cells recovered from an in vivo 
infection and from cells grown in vitro. After labeling, the 
cDNAs are hybridized against the DNA chip(s) disclosed 
herein. Genomic fragments that hybridize to the in vivo 
probe but not to the in vitro probe identify candidate 
virulence genes. 
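
The comparison described in the preceding paragraph amounts to a set difference, which can be sketched in Python as follows. The representation of chip readings as a mapping from fragment identifier to signal intensity, the numeric cutoff that decides whether a fragment "hybridizes", and all names and values are hypothetical choices made for this illustration.

```python
# Illustrative sketch only; data layout, cutoff, and names are hypothetical.
def hybridizing(signals, threshold):
    """Return the fragment IDs whose signal meets the cutoff."""
    return {frag for frag, value in signals.items() if value >= threshold}


def candidate_virulence_fragments(in_vivo, in_vitro, threshold=2.0):
    """Fragments that hybridize to the in vivo probe but not the in vitro probe."""
    return sorted(hybridizing(in_vivo, threshold) - hybridizing(in_vitro, threshold))


# Made-up readings for three hypothetical fragments.
in_vivo_signal = {"frag_a": 5.0, "frag_b": 4.0, "frag_c": 0.5}
in_vitro_signal = {"frag_a": 6.0, "frag_b": 0.3, "frag_c": 0.2}
candidates = candidate_virulence_fragments(in_vivo_signal, in_vitro_signal)
```

With these made-up readings only frag_b is reported, since frag_a hybridizes to both probes and frag_c to neither.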

Also contemplated by this aspect of the invention is a 
method for analyzing gene expression in S. pneumoniae cells 
grown in or harvested from any desirable in vitro or in vivo 
environment, wherein said environment may include compounds 
whose effects on gene expression are to be determined. 

In another embodiment, the present invention relates to 
a DNA bio-chip, useful for correlating DNA sequence with 
biological function. The bio-chip comprises an array of the 
genomic DNA fragments disclosed herein, or portions thereof, 
attached to the surface of any suitable solid support 
material. The bio-chip further comprises a layer of 
competent S. pneumoniae cells suspended over the DNA array 
in any suitable semi-solid medium such as agar or agarose. 
The cells suspended on the bio-chip comprise known or 
unknown mutant strains, or they may be wild-type cells. The 
cell layer is in contact with the DNA matrix such that DNA 
on the chip can be taken up by the cells. 

The bio-chip is useful for several purposes. For 
example, the bio-chip can be used to localize an unknown 
mutation to a specific region of the genome by 
complementation. The bio-chip enables correlating a 
phenotype with a genetic locus. For example, mutant cells 
harboring one or more mutations and having at least one 
screenable or selectable phenotype can be applied to a 
bio-chip and subjected to an environment that allows for 
selection, or for screening by complementation. If said 
phenotype is the result of a chromosomal mutation or 
mutations that map to a genomic fragment present on the 
chip, DNA uptake by the cells and repair of the mutation by 
recombination will be identifiable by a suitable screen or 
selection. 

In a preferred embodiment, the bio-chip is overlaid 
with competent S. pneumoniae cells. Methods for preparing 
competent cells are known (See e.g. LeBlanc et al. Plasmid 
28, 130-145, 1992; Pozzi et al. J. Bacteriol. 178, 6087-6090, 
1996). 

Other embodiments of this aspect of the invention are 
contemplated. For example, the genomic fragments disclosed 
herein could be prepared and dispensed into individual wells 
of a 96-well microtitre plate. Competent S. pneumoniae 
cells could then be added to the wells under conditions 
suitable for DNA uptake, followed by plating onto any 
suitable selection or screening medium, for example an agar 
plate containing suitable growth and/or selection/screening 
components. 



Diagnostic Kits and Assays 
The present invention further relates to kits and 
assays that can be used for rapid and efficient detection of 
S. pneumoniae cells. Also contemplated are kits for 
detecting mutations carried by S. pneumoniae cells. Kits of 
this nature are particularly attractive in the clinical 
environment where knowledge about the identity of a pathogen 
and/or of the basis for resistance to antibiotic treatments 
is essential for effective medical treatment. In the long 
term, knowledge of the mutations that lead to resistance 
will enable the design of new antibacterial agents. 

A kit for detecting S. pneumoniae cells can be based on 
antibody recognition of S. pneumoniae specific antigens or 
epitopes, or on nucleic acid hybridization techniques for 
the detection of S. pneumoniae specific nucleic acid 
molecules. 

A variety of embodiments are contemplated in this 
aspect of the invention. In one embodiment a kit is provided 
for detecting mutations in drug-resistant S. pneumoniae. For 
this purpose, DNA is prepared from a resistant isolate and 
from a wild-type strain. In a preferred embodiment, the 
polymerase chain reaction (i.e. PCR) is used to amplify DNA 
samples representing any one or all of the genomic fragments 
disclosed herein. The amplified DNAs from the mutant and 
wild-type cells are hybridized to a DNA chip having fixed 
thereon any one or more of the genomic fragments disclosed 
herein. Amplified DNA samples from the mutant and wild-type 
strain are labeled by any suitable means, for example using 
radioisotopes or fluorescent labeling. Hybridization of the 
amplified DNAs to the chip under conditions that can 
discriminate single or multiple base pair mismatches enables 
the detection of differences between the mutant and 
wild-type samples. This method identifies a specific fragment of 
the genome that is altered in the mutant strain. The 
specific mutation can be determined by conventional DNA 
sequence analysis. 
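
One way to flag the altered fragment from such a paired hybridization is sketched below in Python. The sketch assumes that a mismatch reduces the mutant signal relative to the wild-type signal on the same fragment; the ratio cutoff, the minimum signal, the data layout, and all names are hypothetical and would need to be set empirically.

```python
# Illustrative sketch only; thresholds and data layout are hypothetical.
def altered_fragments(wild_type, mutant, min_signal=1.0, max_ratio=0.5):
    """Return fragment IDs whose mutant signal drops relative to wild type.

    Fragments with a wild-type signal below min_signal are skipped as
    uninformative. A fragment is flagged when mutant / wild type is at
    or below max_ratio.
    """
    flagged = []
    for frag, wt_signal in wild_type.items():
        if wt_signal < min_signal:
            continue
        if mutant.get(frag, 0.0) / wt_signal <= max_ratio:
            flagged.append(frag)
    return sorted(flagged)


# Made-up readings for three hypothetical fragments.
wt = {"f1": 10.0, "f2": 8.0, "f3": 0.2}
mut = {"f1": 9.5, "f2": 2.0, "f3": 0.1}
flagged = altered_fragments(wt, mut)
```

A flagged fragment only narrows the search; the specific mutation would still be confirmed by sequencing, as the text states.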

This aspect of the invention also relates to the 
detection of S. pneumoniae proteins in a sample using 
antibody molecules raised against any suitable ORF disclosed 
herein. Antibody detection methods are well known to those 
skilled in the art including, for example, a variety of 
radioimmunological assays (See e.g. P. Tijssen, Practice 
and Theory of Enzyme Immunoassays: Laboratory Techniques in 
Biochemistry and Molecular Biology, Elsevier Science 
Publishers, Amsterdam, The Netherlands, 1985). 

Test samples suitable for use in this aspect of the 
invention include but are not limited to biological fluids 
such as sputum, blood, serum, plasma, and urine, and to biopsy 
samples. 

Skilled artisans will recognize that the disclosed 
method and reagents can be readily incorporated into a kit. 
For example, a kit would contain one or more receptacles 
comprising one or more of the following: PCR reagents, DNA 
chip reagents, labeling reagents, assorted buffers, and/or 
antibodies. 

Production of Antibodies 

The proteins of this invention and fragments 
thereof may be used in the production of antibodies. The 
term "antibody" as used herein describes antibodies, 
fragments of antibodies (such as, but not limited to, Fab, 
Fab', F(ab')2, and Fv fragments), and chimeric, humanized, 
veneered, resurfaced, or CDR-grafted antibodies capable of 
binding antigens of a similar nature as the parent antibody 
molecule from which they are derived. The instant invention 
also encompasses single chain polypeptide binding molecules. 

The production of antibodies, both monoclonal and 
polyclonal, in animals is well known in the art. See, e.g., 
C. Milstein, Handbook of Experimental Immunology (Blackwell 
Scientific Pub., 1986); J. Goding, Monoclonal Antibodies: 
Principles and Practice (Academic Press, 1983). For the 
production of monoclonal antibodies the process begins with 
injecting a mouse, or other suitable animal, with an 
immunogen. The mouse is subsequently sacrificed and cells 
taken from its spleen are fused with myeloma cells, 
resulting in a hybridoma that can be cultured in vitro. 
Hybridomas are screened for clones that secrete a single 
antibody species, specific for the immunogen. 

Chimeric antibodies are described in U.S. Patent No.
4,816,567, herein incorporated by reference, which teaches
methods and vectors for preparing chimeric antibodies. An
alternative approach is provided in U.S. Patent No.
4,816,397, the entire contents of which are herein
incorporated by reference. This patent teaches
co-expression of heavy and light chains in the same host cell.

The method taught in U.S. Patent 4,816,397 has
been further refined in European Patent Publication No. 0
239 400. The teachings of this publication are preferred for
engineering monoclonal antibodies. In this technology the
complementarity determining regions (CDRs) of a human
antibody are replaced with the CDRs of a murine monoclonal
antibody, thereby converting the specificity of the human
antibody to the specificity of the murine antibody.

Single chain antibodies and libraries thereof
provide yet another means for genetically engineering
antibody molecules. (See, e.g. R.E. Bird, et al., Science
242:423-426 (1988); PCT Publication Nos. WO 88/01649, WO
90/14430, and WO 91/10737.) Single chain antibody technology
involves covalently joining the binding regions of heavy and
light chains thereby generating a single polypeptide chain
having the binding specificity of an intact antibody
molecule.



The antibodies contemplated by the present invention
are useful in diagnostics, therapeutics, or in
diagnostic/therapeutic combinations.

The proteins of this invention, or suitable fragments
thereof, can be used to generate polyclonal or monoclonal
antibodies, and various inter-species hybrids, or humanized
antibodies, or antibody fragments, or single-chain
antibodies. The techniques for producing antibodies are well
known to skilled artisans. (See e.g. A.M. Campbell,
Monoclonal Antibody Technology: Laboratory Techniques in
Biochemistry and Molecular Biology, Elsevier Science
Publishers, Amsterdam (1984); Kohler and Milstein, Nature
256, 495-497 (1975); Monoclonal Antibodies: Principles &
Applications, Ed. J.R. Birch & E.S. Lennox, Wiley-Liss, 1995.)

A protein or peptide to be used as an immunogen may be
administered in an adjuvant by subcutaneous or
intraperitoneal injection into, for example, a mouse or a
rabbit. For the production of monoclonal antibodies, spleen
cells from immunized animals are removed, fused with myeloma
cells, such as SP2/0-Ag14 cells, and allowed to become
monoclonal antibody producing hybridoma cells in the manner
known to the skilled artisan. Hybridomas that secrete the
desired antibody molecule can be screened by a variety of
well known methods, for example ELISA assay, western blot
analysis, or radioimmunoassay (Lutz et al. Exp. Cell Res.
175, 109-124 (1988); Monoclonal Antibodies: Principles &
Applications, Ed. J.R. Birch & E.S. Lennox, Wiley-Liss, 1995).

For some applications it is desirable to have an
antibody labeled in some fashion. Procedures for labeling
antibody molecules with radioisotopes, affinity labels, such
as biotin or avidin, enzymatic labels, for example
horseradish peroxidase, and fluorescent labels such as FITC
or rhodamine, are widely known (See e.g. Enzyme-Mediated
Immunoassay, Ed. T. Ngo, H. Lenhoff, Plenum Press 1985;
Principles of Immunology and Immunodiagnostics, R.M. Aloisi,
Lea & Febiger, 1988).

Labeled antibodies are useful for a variety of
diagnostic applications. In one embodiment, the present
invention relates to the use of labeled antibodies to detect
the presence of S. pneumoniae cells and proteins. Also
contemplated are applications that use antibodies,
preferably single chain antibodies, directed against a S.
pneumoniae protein. Proteins identified as "external
targets" are preferred for the generation of single chain
antibodies. Single chain antibody libraries directed against
S. pneumoniae surface proteins and cell wall proteins can be
produced by applying the phage display technique to crude
membrane preparations. Antibodies that recognize and bind to
external target proteins and/or cell wall proteins could be
used as therapeutic agents to inhibit the growth of S.
pneumoniae. Alternatively, the antibodies could be used in a
screen to identify potential inhibitors of an external
target protein. For example, in a competitive displacement
assay, an antibody or compound to be tested is labeled by
any suitable method. Competitive displacement of an antibody
from an antibody-antigen complex by a test compound provides
a means to identify new antibacterial compounds.



Protein Production Methods 
The present invention relates further to
substantially purified proteins encoded by the ORFs
disclosed herein (SEQ ID NO: 87 through SEQ ID NO: 228).

Skilled artisans will recognize that proteins can
be synthesized by different methods, for example, chemical
methods or recombinant methods, as described in U.S. Patent
4,617,149, hereby incorporated by reference.

The principles of solid phase chemical synthesis
of polypeptides are well known in the art and may be found
in general texts relating to this area. See, e.g., H. Dugas
and C. Penney, Bioorganic Chemistry (1981) Springer-Verlag,
New York, 54-92. Peptides may be synthesized by solid-phase
methodology utilizing an Applied Biosystems 430A peptide
synthesizer (Applied Biosystems, Foster City, CA) and
synthesis cycles supplied by Applied Biosystems. Protected
amino acids, such as t-butoxycarbonyl-protected amino acids,
and other reagents are commercially available from many
chemical supply houses.

The proteins and peptides of the present invention
can also be made by recombinant DNA methods. Recombinant
methods are preferred if a high yield is desired.
Recombinant methods involve expressing a cloned ORF/gene in
a suitable host cell. A gene is introduced into a host cell
by any suitable means, well known to those skilled in the
art. While chromosomal integration of a cloned gene is
within the scope of the present invention, it is preferred
that a cloned gene be maintained extra-chromosomally, as
part of a vector wherein the gene is in operable-linkage to
a constitutive or inducible promoter.

Recombinant methods are also useful in
overproducing a membrane-bound or membrane-associated
protein. In some cases, membranes prepared from recombinant
cells that overexpress such proteins provide an enriched
source of the protein. Such membranes are useful for
evaluating the function of the protein and/or for evaluating
inhibitors of the protein.

Expressing Recombinant Proteins in Procaryotic and
Eucaryotic Host Cells

Procaryotes are generally used for cloning DNA
sequences and for constructing vectors. For example, the
Escherichia coli K12 strain 294 (ATCC No. 31446) is
particularly useful for expression of foreign proteins.
Other strains of E. coli, bacilli such as Bacillus subtilis,
enterobacteriaceae such as Salmonella typhimurium or
Serratia marcescens, and various Pseudomonas species may also
be employed as host cells in cloning and expressing the
recombinant proteins of this invention. Also contemplated
are various strains of Streptococcus and Streptomyces.

For effective expression of a recombinant protein
a gene or ORF may be linked to a known promoter sequence.
Suitable bacterial promoters include β-lactamase [e.g.
vector pGX2907, ATCC 39344, contains a replicon and
β-lactamase gene], lactose systems [Chang et al., Nature
(London), 275:615 (1978); Goeddel et al., Nature (London),
281:544 (1979)], alkaline phosphatase, and the tryptophan
(trp) promoter system [vector pATH1 (ATCC 37695)] designed
for the expression of a trpE fusion protein. Hybrid
promoters such as the tac promoter (isolatable from plasmid
pDR540, ATCC-37282) are also suitable. Promoters for use in
bacterial systems also will contain a Shine-Dalgarno
sequence operably linked to the DNA encoding the desired
polypeptides. These examples are illustrative rather than
limiting.

A variety of mammalian cell systems and yeasts are
also suitable host cells. The yeast Saccharomyces
cerevisiae is a commonly used eucaryotic microorganism.
Other yeasts such as Kluyveromyces lactis are also suitable.
For expression of recombinant genes in Saccharomyces, the
plasmid YRp7 (ATCC-40053), for example, may be used. See,
e.g., L. Stinchcomb, et al., Nature, 282:39 (1979); J.
Kingsman et al., Gene, 7:141 (1979); S. Tschemper et al.,
Gene, 10:157 (1980). Plasmid YRp7 contains the TRP1 gene
that provides a selectable marker in a trp1 mutant.



Purification of Recombinantly-Produced Protein

An expression vector carrying an ORF of the
present invention is transformed or transfected into a
suitable host cell using standard methods. Cells which
contain the vector are propagated under conditions suitable
for expression of the encoded protein. If the gene is under
the control of an inducible promoter then suitable growth
conditions would incorporate the appropriate inducer. The
recombinantly-produced protein may be purified from cellular
extracts of transformed cells by any suitable means.

In a preferred process for protein purification a
gene/ORF is modified at the 5' end, or some other position,
to incorporate a plurality of histidine residues at the
amino terminus of the encoded protein. The "histidine tag"
produced thereby enables a single-step protein purification
method referred to as "immobilized metal ion affinity
chromatography" (IMAC), essentially as described in U.S.
Patent 4,569,794, hereby incorporated by reference. The IMAC
method enables rapid isolation of substantially pure protein
starting from a crude cellular extract.

As skilled artisans will recognize, the proteins
of the invention can be encoded by a multitude of different
nucleic acid sequences owing to the degeneracy of the
genetic code. The present invention further comprises these
alternate nucleic acid sequences.
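
To give a sense of scale for the degeneracy noted above, the
number of distinct nucleic acid sequences that can encode a
given peptide is the product of the number of synonymous
codons for each residue. The following is a minimal Python
sketch assuming the standard genetic code; the function name
and the example peptides are illustrative assumptions and do
not correspond to any sequence disclosed herein.

```python
# Number of synonymous codons per amino acid in the standard genetic code
# (one-letter amino acid codes; the 61 sense codons in total).
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_coding_sequences(peptide: str) -> int:
    """Return how many distinct DNA sequences encode the given peptide."""
    total = 1
    for residue in peptide.upper():
        if residue not in CODON_COUNTS:
            raise ValueError(f"unknown amino acid code: {residue!r}")
        total *= CODON_COUNTS[residue]
    return total


if __name__ == "__main__":
    # Hypothetical peptides, chosen only to illustrate the arithmetic.
    for pep in ("MW", "MHHHHHH", "LSR"):
        print(pep, count_coding_sequences(pep))
```

Even a three-residue peptide rich in leucine, serine, or
arginine can be encoded by hundreds of alternative sequences,
which is why the alternate nucleic acid sequences are claimed
as a class rather than enumerated.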

The ribonucleic acid compounds of the present
invention may be prepared using the polynucleotide synthetic
methods discussed supra, or they may be prepared
enzymatically using RNA polymerase to transcribe a DNA
template.

The most preferred systems for preparing the
ribonucleic acids of the present invention employ the RNA
polymerase from the bacteriophage T7 or the bacteriophage
SP6. These RNA polymerases are highly specific, requiring
the insertion of bacteriophage-specific sequences at the 5'
end of the template to be transcribed. See, J. Sambrook, et
al., supra, at 18.82-18.84.

This invention also provides nucleic acids, RNA or
DNA, which are complementary to the sequences disclosed
herein.
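
As an illustration of what "complementary" means in practice,
the short Python sketch below derives the reverse complement
of a DNA strand and the corresponding RNA spelling. The
function names are assumptions made for this example; the
input shown is simply the first ten bases of SEQ ID NO:1.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return dna.translate(_DNA_COMPLEMENT)[::-1]


def dna_to_rna(dna: str) -> str:
    """Return the RNA spelling of a DNA sequence (T replaced by U)."""
    return dna.replace("T", "U").replace("t", "u")


if __name__ == "__main__":
    fragment = "ATGGTGGAAG"  # first ten bases of SEQ ID NO:1
    print(reverse_complement(fragment))
    print(dna_to_rna(reverse_complement(fragment)))
```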

The present invention also provides probes and
primers useful for a variety of molecular biology techniques
including, for example, hybridization screens of genomic or
subgenomic libraries, detection and quantification of mRNA
species as a means to analyzing gene expression, and
amplification of any region of the Streptococcus pneumoniae
genome disclosed by the sequences herein. A nucleic acid
compound is provided comprising any of the sequences
disclosed herein, or a complementary sequence thereof, or a
fragment thereof, which is at least 15 base pairs in length,
and which will hybridize selectively to Streptococcus
pneumoniae DNA or mRNA. Preferably, the 15 or more base pair
compound is DNA. A probe or primer length of at least 15
base pairs is dictated by theoretical and practical
considerations. See e.g. B. Wallace and G. Miyada,
"Oligonucleotide Probes for the Screening of Recombinant DNA
Libraries," In Methods in Enzymology, Vol. 152, 432-442,
Academic Press (1987).
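
One way to see why a length of about 15 bases is a sensible
lower limit is to estimate how often a probe of a given length
would match a genome purely by chance, and how stable the
resulting duplex would be. The Python sketch below is a rough
estimate under stated assumptions: a genome of about 2.2
million base pairs (an approximate figure assumed here, not
taken from this disclosure), random base composition, and the
common "2 + 4" rule of thumb for the melting temperature of
short oligonucleotides (often called the Wallace rule). The
function names are hypothetical.

```python
def expected_random_hits(probe_len: int, genome_bp: int) -> float:
    """Expected number of chance perfect matches for one probe.

    Assumes equal base frequencies and counts both strands.
    """
    return 2 * genome_bp / 4 ** probe_len


def wallace_tm(oligo: str) -> int:
    """Approximate melting temperature in degrees C for a short oligo.

    Uses the rule of thumb Tm = 2*(A+T) + 4*(G+C).
    """
    seq = oligo.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("oligo may contain only A, C, G and T")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    genome_bp = 2_200_000  # assumed approximate genome size
    for n in (10, 12, 15, 20):
        print(n, expected_random_hits(n, genome_bp))
    # First 15 bases of SEQ ID NO:1, used only as an example probe.
    print(wallace_tm("ATGGTGGAAGTTCCA"))
```

Under these assumptions a 10-mer is expected to occur several
times by chance, whereas a 15-mer is expected to occur far
less than once, so a 15-base probe can reasonably be expected
to hybridize selectively.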

The probes and primers of this invention can be
prepared by methods well known to those skilled in the art
(See e.g. Sambrook et al. supra). In a most preferred
embodiment these probes and primers are synthesized by the
polymerase chain reaction (PCR).

The present invention also relates to recombinant
DNA cloning vectors and expression vectors comprising the
nucleic acids of the present invention. Preferred nucleic
acid vectors are those which comprise DNA. The skilled
artisan understands that choosing the most appropriate
cloning vector or expression vector depends on a number of
factors including the availability of restriction enzyme
sites, the type of host cell into which the vector is to be
transfected or transformed, the purpose of the transfection
or transformation (e.g., stable transformation as an
extrachromosomal element, or integration into the host
chromosome), the presence or absence of readily assayable or
selectable markers (e.g., antibiotic resistance and
metabolic markers of one type and another), and the number
of gene copies desired in the host cell.

Vectors suitable to carry the nucleic acids of the
present invention comprise RNA viruses, DNA viruses, lytic
bacteriophages, lysogenic bacteriophages, stable
bacteriophages, plasmids, viroids, and the like. The most
preferred vectors are plasmids.

Host cells harboring the nucleic acids disclosed
herein are also provided by the present invention. A
preferred host is E. coli which has been transfected or
transformed with a vector that comprises a nucleic acid of
the present invention.

The present invention also provides a method for
constructing a recombinant host cell capable of expressing
an ORF disclosed herein, said method comprising transforming
or otherwise introducing into a host cell a recombinant DNA
vector that comprises an isolated DNA sequence which encodes
said ORF. The preferred host cell is any strain of E. coli
which can accommodate high level expression of an exogenously
introduced gene. Transformed host cells are cultured under
conditions well known to skilled artisans such that said ORF
is expressed, thereby producing the encoded protein in the
recombinant host cell.

For the purpose of discovering new inhibitors of
cell wall biosynthesis, it would be desirable to determine
agents that inhibit enzymes required for synthesis of the
cell wall and/or agents that interact with membrane
proteins. A method for identifying compounds that interact
with such enzymes and membrane proteins comprises contacting
said proteins with a test compound and monitoring an
interaction and/or inhibition by any suitable means.

The instant invention provides a screening system
for compounds that interact with membrane proteins of this
invention, said screening system comprising the steps of:

a) preparing a membrane protein, or membranes
enriched in said protein;

b) exposing the protein source of (a) to a test
compound; and

c) quantifying the interaction of said protein with
said compound by any suitable means.

The screening method of this invention may be
adapted to automated procedures such as a PANDEX®
(Baxter-Dade Diagnostics) system, allowing for efficient
high-volume screening of compounds.

In a typical screening protocol, a protein to be
tested is prepared as described herein, preferably using
recombinant DNA technology. A test compound is introduced
into a reaction vessel containing said protein. The
reaction/interaction of said protein and said compound is
monitored by any suitable means. For example, a
radioactively-labeled or chemically-labeled compound or
protein may be used. Specific association between a test
compound and protein is monitored by any suitable means.
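
The quantification step of such a screen can be sketched in a
few lines of Python. This is a hedged illustration only: it
assumes the readout is a signal (for example, counts from a
radioactively-labeled ligand) that falls when a test compound
interferes with binding, and the function names, compound
names, and the 50% hit threshold are all hypothetical choices
made for the example, not part of the disclosed method.

```python
def percent_inhibition(control_signal: float, test_signal: float,
                       background: float = 0.0) -> float:
    """Percent loss of specific signal relative to an untreated control."""
    specific_control = control_signal - background
    if specific_control <= 0:
        raise ValueError("control signal must exceed background")
    specific_test = test_signal - background
    return 100.0 * (1.0 - specific_test / specific_control)


def flag_hits(readings: dict, control_signal: float,
              background: float = 0.0, threshold: float = 50.0) -> list:
    """Return the names of compounds at or above the inhibition threshold."""
    return [
        name
        for name, signal in readings.items()
        if percent_inhibition(control_signal, signal, background) >= threshold
    ]


if __name__ == "__main__":
    # Hypothetical plate readings in arbitrary signal units.
    readings = {"cmpd_A": 550.0, "cmpd_B": 950.0}
    print(flag_hits(readings, control_signal=1000.0, background=100.0))
```

Subtracting a background (non-specific) signal before taking
the ratio keeps the percentage tied to specific association,
which is the quantity the screening protocol is meant to
monitor.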

The following examples more fully describe the
present invention. Those skilled in the art will recognize
that the particular reagents, equipment, and procedures
described are merely illustrative and are not intended to
limit the present invention in any manner.

EXAMPLE 1 

Vector for Expressing S. pneumoniae ORF in a Host Cell

An expression vector suitable for expressing a S.
pneumoniae gene or fragment thereof in a variety of
procaryotic host cells, such as E. coli, is easily made. A
suitable parent vector contains an origin of replication
(Ori), a marker for selecting transformants, for example, an
ampicillin resistance gene (Amp), and further comprises
suitable transcriptional and translational signals, for
example, the T7 promoter and T7 terminator sequences, in
operable-linkage to a S. pneumoniae coding region. For
example, pET11A (obtained from Novagen, Madison WI) is
linearized by restriction with endonucleases NdeI and BamHI.
Linearized pET11A is ligated to a DNA fragment bearing NdeI
and BamHI sticky ends and comprising a coding region for a
S. pneumoniae ORF.

The ORF used in this construction may be modified
at the 5' end (amino terminus of encoded protein or peptide)
to simplify purification of the encoded protein or peptide.
For this purpose, an oligonucleotide encoding 8 histidine
residues is inserted after the transcriptional and
translational start sites. Placement of the histidine
residues at the amino terminus of the encoded protein
enables the IMAC one-step protein purification procedure.
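
The layout of such an insert can be sketched in Python as
follows. This is an illustrative sketch rather than the
construct actually used: it assumes the start codon is
supplied by the ATG within the NdeI recognition site
(CATATG), that the eight histidine codons are placed
immediately after it, and that the ORF ends with a stop codon
followed by a BamHI site (GGATCC). The function name, the
choice of CAC as the histidine codon, and the example ORF
(the first 18 bases of SEQ ID NO:1 with a TAA stop codon
appended) are assumptions made for the example.

```python
NDEI_SITE = "CATATG"   # NdeI recognition site; its ATG serves as the start codon
BAMHI_SITE = "GGATCC"  # BamHI recognition site
HIS_CODON = "CAC"      # one of the two histidine codons (CAC or CAT)
STOP_CODONS = {"TAA", "TAG", "TGA"}


def build_his_tagged_insert(orf: str, n_his: int = 8) -> str:
    """Assemble NdeI site + His codons + ORF + BamHI site, kept in frame.

    Flanking bases that a restriction enzyme may need near a fragment
    end are deliberately left out of this sketch.
    """
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    if orf[-3:] not in STOP_CODONS:
        raise ValueError("ORF must end with a stop codon")
    for site in (NDEI_SITE, BAMHI_SITE):
        if site in orf:
            # An internal site would be cut during preparation of the insert.
            raise ValueError(f"ORF contains an internal {site} site")
    return NDEI_SITE + HIS_CODON * n_his + orf + BAMHI_SITE


if __name__ == "__main__":
    example_orf = "ATGGTGGAAGTTCCAGATTAA"  # illustrative only
    print(build_his_tagged_insert(example_orf))
```

Checking for internal NdeI or BamHI sites before assembly is
a design choice of this sketch; an ORF that contains either
site would need a different pair of enzymes or a silent
change in the sequence.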

Example 2

Recombinant Expression and Purification of a Protein Encoded
by a S. pneumoniae ORF

An expression vector that carries an ORF from the
S. pneumoniae genome, as disclosed in Example 1, and which
ORF is operably-linked to an expression promoter, is
transformed into E. coli BL21 (DE3) (hsdS gal λcIts857
ind1 Sam7 nin5 lacUV5-T7 gene 1) using standard methods.
Transformants, selected for resistance to ampicillin, are
chosen at random and tested for the presence of the vector
by agarose gel electrophoresis using quick plasmid
preparations. Colonies that contain the vector are grown in
L broth and the protein produced by the vector-borne ORF is
purified by IMAC, essentially as described in US Patent
4,569,794.

Briefly, the IMAC column is prepared as follows. A
metal-free chelating resin (e.g. Sepharose 6B IDA,
Pharmacia) is washed in distilled water to remove
preservatives and then infused with a suitable metal ion
[e.g. Ni(II), Co(II), or Cu(II)] by adding a 50 mM metal
chloride or metal sulfate aqueous solution until about 75%
of the interstitial spaces of the resin are saturated with
colored metal ion. The column is then ready to receive a
crude cellular extract containing the recombinant protein
product.

Unbound proteins and other materials are removed by
washing the column with any suitable buffer, pH 7.5. Bound
protein is eluted in any suitable buffer at pH 4.3, or
preferably with an imidazole-containing buffer at pH 7.5.

Example 3

DNA Chip Production
Any one or more of the S. pneumoniae genome DNA
fragments disclosed herein, or fragments thereof, are arrayed
onto a solid support. It is preferred that fragments be in
the size range of 14 base pairs to 500 base pairs. The DNA
samples are most conveniently synthesized by PCR using
standard methods to amplify regions disclosed by the genomic
sequences herein. The method of Schena et al. is used to
spot about 1 ng to 10 ng of a DNA sample onto glass
microscope slides that have been treated with poly-L-lysine
(M. Schena et al. "Quantitative monitoring of gene
expression patterns with a complementary DNA microarray"
Science, 270, 467-470, 1995). After spotting DNA samples
onto the chip and air-drying, the chips are rehydrated by
incubation for about 2 hours in a humid chamber. Chips are
then placed at 100° C for 1 minute, rinsed in 0.1% SDS, and
treated with 0.05% succinic anhydride in 50%
1-methyl-2-pyrrolidinone and 50% boric acid.
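
As a planning aid for the PCR step in Example 3, the Python
sketch below divides a disclosed sequence into consecutive
windows that respect the preferred 14 to 500 base pair size
range. It is a minimal sketch under assumptions: windows are
non-overlapping except where a short remainder must be
shifted back to reach the minimum size, primer design is not
considered, and the function name is hypothetical.

```python
def tile_fragments(seq_len: int, max_size: int = 500, min_size: int = 14) -> list:
    """Return (start, end) windows, 0-based and end-exclusive, covering seq_len.

    Each window is at most max_size long. A final remainder shorter than
    min_size is shifted backward so that it overlaps the previous window
    instead of falling below the minimum size.
    """
    if seq_len < min_size:
        raise ValueError("sequence is shorter than the minimum fragment size")
    windows = []
    start = 0
    while start < seq_len:
        end = min(start + max_size, seq_len)
        if end - start < min_size:
            start = end - min_size
        windows.append((start, end))
        start = end
    return windows


if __name__ == "__main__":
    # SEQ ID NO:1 is listed as 1267 base pairs long.
    print(tile_fragments(1267))
```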


Example 4 

S. pneumoniae Gene Expression Analysis using DNA Chips 
RNA prepared from cells grown under any desirable
conditions is used as the template for cDNA synthesis by
reverse transcription, using methods well known to the skilled
artisan (See e.g. Molecular Cloning, 2d Ed., J. Sambrook, E.
Fritsch, T. Maniatis, 1989). For example, total RNA of
strain R6 is prepared according to the method of Logeman
et al. (Analytical Biochemistry, 1987, 163, 16-20) using
guanidine hydrochloride. After ethanol precipitation, the
total RNA is dissolved in a buffered solution such as
Tris-EDTA (TE). Complementary DNAs are synthesized with the aid
of the StrataScript RT-PCR kit (Stratagene, Inc.) in
accordance with the supplier's recommendations (See Schena
et al. Id.). Briefly, a 50 ul reaction contains about 0.1
ug/ul of RNA. First strand synthesis is primed using random
primers, 1X first strand buffer, 0.03 U/ul ribonuclease
block, 500 uM dATP, 500 uM dTTP, 40 uM dGTP, 40 uM
fluorescein-12-dCTP (New England Nuclear), and 0.03 U/ul
reverse transcriptase. Reactions are incubated for 60
minutes at 37° C, precipitated with ethanol, and resuspended
in 10 ul TE pH 8. Samples are heated for 3 minutes at 94° C
and chilled on ice. The RNA is degraded by adding 0.25 ul of
10 N NaOH, followed by a 10 minute incubation at 37° C. The
samples are neutralized with 2.5 ul of 1M Tris-HCl, pH 8 and
0.25 ul of 10 N HCl. After ethanol precipitation, the
nucleic acid pellet is washed and dried in vacuo.
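
The final concentrations quoted for the 50 ul first strand
reaction translate into absolute amounts by simple
multiplication. The Python sketch below works through that
arithmetic for a few of the listed components; it is offered
only as a convenience, and the helper names are hypothetical.

```python
REACTION_VOLUME_UL = 50.0  # reaction volume stated in Example 4


def mass_ug(conc_ug_per_ul: float, volume_ul: float = REACTION_VOLUME_UL) -> float:
    """Mass in micrograms from a concentration in ug/ul."""
    return conc_ug_per_ul * volume_ul


def amount_nmol(conc_um: float, volume_ul: float = REACTION_VOLUME_UL) -> float:
    """Amount in nanomoles from a micromolar concentration.

    uM multiplied by ul gives picomoles; dividing by 1000 gives nanomoles.
    """
    return conc_um * volume_ul / 1000.0


def enzyme_units(conc_u_per_ul: float, volume_ul: float = REACTION_VOLUME_UL) -> float:
    """Enzyme units from a concentration in U/ul."""
    return conc_u_per_ul * volume_ul


if __name__ == "__main__":
    print("RNA (ug):", mass_ug(0.1))
    print("dATP (nmol):", amount_nmol(500))
    print("fluorescein-12-dCTP (nmol):", amount_nmol(40))
    print("ribonuclease block (U):", enzyme_units(0.03))
```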

Prior to hybridization, DNA chips prepared as in Example
3 are denatured by heating to 90° C for 2 minutes.
Hybridization reactions contain about 1 ul of
fluorescently-labeled cDNA, and 1 ul of hybridization buffer
(10x SSC and 0.2% SDS). Probe mixtures are transferred to the
surface of the chip, covered with a cover slip, and incubated
for 18 hours at 65° C. Chips are washed 5 minutes at room
temperature in 1X SSC, 0.1% SDS, then for 10 minutes at room
temperature in 0.1X SSC, 0.1% SDS. After hybridization,
chips are scanned with a laser-scanning device.

Example 5

A DNA Bio Chip for mutation analysis
Duplicate DNA chips are prepared as in Example 3. Each
chip is overlaid with S. pneumoniae cells in a semi-solid
medium, wherein said cells carry a temperature-sensitive
(ts) mutation in a gene required for autolytic activity
(Lyt-). This mutation leads to resistance to lysis at 37° C,
but sensitivity to lytic treatments at 30° C.

S. pneumoniae strain cwl is resistant to lysis by
detergent and penicillin when grown at 37° C, but remains
sensitive when grown at 30° C (cwl is derived from strain
R6; See P. Garcia et al. "Mutants of Streptococcus
pneumoniae that contain a temperature-sensitive autolysin" J.
Gen. Microbiol. 132, 1401-05, 1986). Strain cwl is grown at
30° C and competent cells are prepared according to any
suitable method (e.g. LeBlanc et al. Plasmid 28, 130-145,
1992; Pozzi et al. J. Bacteriol. 178, 6087-6090, 1996).
Competent cwl cells are harvested by centrifugation and
resuspended at about 10^5 cells per ml in 1% melted agar
supplemented with 0.1% (w/v) yeast extract (Difco) and
containing 1% to 2% Triton X-100. Approximately 100 ul to
500 ul of the cell mixture is deposited per square
centimeter onto the bio chip by pipetting onto the chip
surface. After solidification of the agar layer, one of the
bio-chips is incubated at 37° C and the other at 30° C.
Cells that take up a complementing genomic DNA fragment from
the chip surface will be lysed at both 30° C and 37° C,
while non-complemented cells are lysed only at 30° C. Cells
that are complemented by the bio-chip are recognizable by
this phenotypic difference and can be further purified by
well known methods.
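
The read-out logic of Example 5 can be written down as a
small decision rule. The Python sketch below is an
illustration only: it assumes each spot on the duplicate
chips has already been scored as lysed or not lysed at each
temperature, and the function name and the category labels
are hypothetical.

```python
def classify_spot(lysed_at_30c: bool, lysed_at_37c: bool) -> str:
    """Interpret the lysis pattern of one spot on the duplicate chips.

    Complemented cells regain autolytic activity and lyse at both
    temperatures; non-complemented ts mutant cells lyse only at 30 C.
    """
    if lysed_at_30c and lysed_at_37c:
        return "complemented"
    if lysed_at_30c and not lysed_at_37c:
        return "non-complemented"
    # Lysis only at 37 C, or at neither temperature, is not predicted
    # by the text and would call for a repeat of the spot.
    return "uninterpretable"


if __name__ == "__main__":
    for pattern in ((True, True), (True, False), (False, True), (False, False)):
        print(pattern, classify_spot(*pattern))
```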
SEQUENCE LISTING



(1) GENERAL INFORMATION: 

(i) APPLICANT: Baltz, Richard H. 

Burgett, Stanley G. 
DeHoff, Bradley S. 
Jaskunas Jr., Stanley R. 
Mills, Bradley J. 
Norris, Franklin H. 
Peery, Robert B. 
Rosteck Jr., Paul R. 
Skatrud, Paul L. 
Smith, Michele C. 
Rockey, Pamela K. 
Young-Bellido, Michele 

(ii) TITLE OF INVENTION: Streptococcus Pneumoniae DNA Sequences 

(iii) NUMBER OF SEQUENCES: 122 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Eli Lilly and Company 

(B) STREET: Lilly Corporate Center 

(C) CITY: Indianapolis

(D) STATE: Indiana 

(E) COUNTRY: U.S. 

(F) ZIP: 46285 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Webster, Thomas D. 

(B) REGISTRATION NUMBER: 39,872 

(C) REFERENCE/DOCKET NUMBER: X-11162

(ix) TELECOMMUNICATION INFORMATION: 
(A) TELEPHONE: 317-276-3334 



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1267 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:



ATGGTGGAAG TTCCAGATGA ACGCCTACAA AAACTAACTG AAATGATAAC TCCTAAAAAG 60

ACAGTTCCCA CAACATTTGA ATTTACAGAT ATTGCAGGGA TTGTAAAAGG AGCTTCAAAA 120

GGAGAAGGGC TAGGGAATAA ATTCTTGGCC AATATTCGTG AAGTAGATGC GATTGTTCAC 180

GTAGTTCGTG CTTTTGATGA TGAAAATGTA ATGCGCGAGC AAGGACGTGA AGACGCCTTT 240

GTAGATCCAC TTGCAGATAT TGATACAATT AATCTGGAAT TAATTCTTGC TGACTTAGAA 300

TCAGTGAACA AACGATATGC GCGTGTAGAA AAGATGGCAC GTACGCAAAA AGATAAAGAA 360

TCAGTAGCAG AATTCAATGT TCTTCAAAAG ATTAAACCAG TCCTAGAAGA CGGGAAATCA 420

GCTCGTACCA TTGAATTTAC AGATGAGGAA CAAAAGGTTG TCAAAGGTCT TTTCCTTTTG 480

ACGACTAAAC CAGTTCTTTA TGTAGCTAAT GTGGACGAGG ATGTGGTTTC AGAACCTGAC 540

TCTATCGACT ATGTCAAACA AATTCGTGAA TTTGCAGCGA CAGAAAATGC TGAAGTAGTC 600

GTTATTTCTG CGCGTGCTGA GGAAGAAATT TCTGAATTGG ATGATGAAGA TAAAAAAGAG 660

TTTCTTGAAG CCATTGGTTT GACAGAATCA GGTGTAGATA AGTTGACGCG TGCAGCTTAC 720

CACTTGCTTG GATTGGGAAC TTACTTCACA GCTGGTGAAA AAGAAGTTCG CGCTTGGACT 780

TTCAAACGTG GTATGAAGGC TCCTCAAGCA GCTGGTATTA TCCACTCAGA CTTTGAAAAA 840

GGCTTTATTC GTGCAGTAAC CATGTCATAT GAAGATCTAG TGAAATACGG ATCTGAAAAG 900

GCCGTAAAAG AAGCTGGACG CTTGCGTGAA GAAGGAAAAG AATATATCGT TCAAGATGGC 960

GATATCATGG AATTCCGCTT TAATGTCTAA AAATTAATAA ATGGTGTCAA TTAGGTTGGA 1020

AAAAAATTCC AACCCTTTTG GCTTTTGAAA GGAAAAATAA ATGACCAAAT TACTTGTAGG 1080

TTTGGGAAAT CCAGGGGATA AATATTTTGA AACAAACACA ATGTTGGTTT TATGTTGATT 1140

GATCAACTAG CGAAGAAACA GAATGTCACT TTTACACACG ATAAGATATT TCAAGCTGAC 1200

CTAGCATCCT TTTTCCTAAA TGGAGAAAAA ATTTATCTGG GTTAAACCAA CGACCTTTAT 1260

GGATTGA 1267


(2) INFORMATION FOR SEQ ID NO:2: 





(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1255 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

TGGTCCGTGG TGCTGAGGAC CCTTAGAGTT CGAGTACCAC AAGGTACCGA CTGTTCGTGA 60

TGCGGAGCTG GCAAGGTTTT AACAGATTTG ATTGAACATG GGCAAGAATT TATCGTTGCC 120

CACGGTGGTC GTGGTGGACG TGGAAATATT CGTTTCGCGA CACCAAAAAT CCTGCACCGG 180

AATCTCTGAA AATGGAGAAC CAGGTCAGGA ACGTGAGTTA CAATTGGAAC TAAAAATCTT 240

GGCAGATGTC GGTTTAGTAG GATTCCCATC TGTAGGGAAG TCAACACTTT TAAGTGTTAT 300

TACCTCAGCT AAGCCTAAAA TTGGTGCCTA CCACTTTACC ACTATTGTAC CAAATTTAGG 360

TATGGTTCGC ACCCAATCCA GGTGAATCCT TTGCAGTAGC CGACTTGCCA GGTTTGATTG 420

AAGGGGCTAG TCCAAGGTGT TGGTTTGGGA ACTCAGTTCC TCCGTCACAT CGAGCGTACA 480

CGTGTTATCC TTCACATCAT TGATATGTCA GCTAGCGAAG GCCGTGATCC ATATGAGGAT 540

TACCTAGCTA TCAATAAAGA GCTGGAGTCT TACAATCTTC GCCTCATGGA GCGTCCACAG 600

ATTATTGTAA CTAATAAGAT GGACATGCCT GAGAGTCAGG AAAATCTTGA AGAATTTAAG 660

AAAAAATTGG CTGAAAATTA TGATGAATTT GAAGAGTTAC CAGCTATCTT CCCAATTTCT 720

GGATTGACCA AGCAAGGTCT GGCAACACTT TTAGATGCTA CAGCTGAATT GTTAGACAAG 780

ACACCAGAAT TTTTGCTCTA CGACGAGTCC GATATGGAAG AAGAAGTTTA CTATGGATTT 840

GACGAAGAAG AAAAAGCCTT TGAAATTAGT CGTGATGACG ATGCGACATG GGTACTTTCT 900

GGTGAAAAAC TCATGAAACT CTTTAATATG ACCAACTTTG ATCGTGATGA ATCTGTCATG 960

AAATTTGCCC GTCAGCTTCG TGGTATGGGG GTTGATGAAG CCCTTCGTGC GCGTGGAGCT 1020

AAAGATGGGG ATTTGGTCCG CATTGGTAAA TTTGAGTTTG AATTTGTAGA CTAGGAGACT 1080

GGTATGGGAG ATAAACCGAT ATCTTTCCGA GATGCGGATG GTAATTTTGT TTCCGCCGCA 1140

GACGTTTGGA ATGAAAAGAA ATTGGAAGAA CTATTTAATC GTCTCAATCC AAATCGTGCC 1200

TTGAGATTGG CACGAACTAC AAAGGAAAAT CCATCTCAGT AAAGAAGCTA AAAAA 1255
(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1609 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:





TTACCCATCG CATGACTAAA AATCTCTACT ATCCAATACT AGTTCATATT CTCATCAATA 60

TCACTGCCTT CTGGGATGTT TGGTTACTCC TATTTTCAGG AAGTTAGCTT ACTAAAAAAA 120

TGTCGGAATT TTCCGGCATT TTCTTTTTTC ACAAATAGTC AACGTTTTTC TTTCCGATAC 180

TGAAGTGGTG TGTAGCCACT TATTTTTTTG AATTGATTTT GAAAATAAGA TTGGCGTGAG 240

AAAGGCAGAT AGTGAAGATA GTTAAGAAGA ATAGGATGTT CTTTTTTCCT TTTTGGAAAA 300

CTTCTAAAAT ATGGTATAAT GAAAAGATAA AGAAGTTGGG GGTAGAAGAT GAACATTCAA 360

CAATTACGCT ATGTTGTGGC TATTGCCAAT AGTGGTACTT TTCGTGAAGC TGCTGAAAAG 420

ATGTATGTTA GTCAGCCGAG TCTGTCTATT TCTGTTCGTG ATTTGGAAAA AGAGTTGGGC 480

TTTAAGATTT TCCGTCGGAC CAGCTCAGGG ACTTTCTTGA CCCGTCGTGG GATGGAATTT 540

TATGAAAAAG CGCAAGAATT GGTTAAAGGA TTTGATATTT TTCAAAATCA GTATGCCAAT 600

CCTGAAGAAG AAAAAGATGA ATTTTCCGTT GCTAGCCAGC ACTATGACTT CTTACCACCA 660

ACTATTACGG CCTTTTCAGA GCGCTATCCT GACTATAAGA ACTTCCGTAT TTTTGAATCA 720

ACTACTGTTC AAATATTAGA TGAAGTGGCG CAAGGGCATA GTGAGATTGG GATTATCTAC 780

CTCAACAATC AAAATAAAAA GGGGATTATG CAACGGGTTG AAAAGTTAGG TCTGGAGGTC 840

ATCGAATTGA TTCCTTTCCA TACCCATATT TATCTCTGTG AGGGTCATCC TTTAGCCCAG 900

AAAGAGGAAT TAGTCATGGA GGATTTAGCG GATTTACCAA CGGTTCGTTT CACTCAAGAG 960

AAAGACGAGT ACCTTTATTA TTCAGAGAAC TTTGTCGATA CCAGCGCTAC TCACAGATGT 1020

TTAATGTGAC AGACCGTGCC ACCTTGAATG GTATTTTGGA GCGGACGGAC GCCTATGCGA 1080

CAGGTTCTGG ATTTTTAGAT AGTGACAGTG TTAATGGCAT TACAGTTATT CGTCTCAAGG 1140

ATAACCTAGA TAACCGCATG GTCTATGTTA AACGTGAAGA AGTGGAGCTT AGTCAAGCTG 1200

GGACTCTCTT CGTAGAAGTC ATGCAAGAAT ATTTTGATCA AAAGAGGAAA TCATGAAAAA 1260

AAGAGCAATA GTGGCAGTCA TTGTACTGCT TTTAATTGGG CTGGATCAGT TGGTCAAATC 1320

CTATATCGTC CAGCAGATTC CACTGGGTGA AGTGCGCTCC TGGATTCCCA ATTTCGTTAG 1380

CTTGACCTAC CTGCAAAATC GAGGTGCAGC CTTTTCTATC TTACAAGATC AGCAGCTGTT 1440

ATTCGCTGTC ATTACTCTGG TTGTCGTGAT AGGTGCCATT TGGTATTTAC ATAAACACAT 1500

GGAGGACTCA TTCTGGATGG TCTTGGGTTT GACTCTAATA ATCGCGGGTG GTCCTGGAAA 1560

CTTTATTGAC AGGGTCAGTC AGGGCTTTGT TGTGGATATG TTCCACCTT 1609
(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 763 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

GTAACTTAGG GCCCCAAGTC CATAACTTGC TTGACGCATG CTATCACTAA CAGATAAAAG 60 

GGCTTCTTCT GTTGAGCGAA TAATGACTGG CAACACCATG ATGACTGAGG TTAAGATTCC 120 

TGATAACAGA GAGTATTGAA AACCTAAGAA GACTACAAAG AAGAGCATGC CAAACAGACC 180 

AAAAACAATG GAAGGAATCC CAGACAAGGT ATCTGAGGCC AATCGCATGA TTTTAACACA 240 

AAGGGAATCT TTTTTTGTAT ATTCCACAAG ATAAAAACCA GCAAAAATCC CTATGGGCAA 300 

GGCTAAAAGA AGAGCACCAA AGACCAGAAT AACGGTGGAA ATAATCGCTG GCATAAGGGA 360 

AATGTTCTCA GAAGTATAAG TCCAAGAAAA GAGGGATAGA CTTAGATGAG GTAAACCTTT 420 

GATGAGGATA AAACCAATGA TTAAAAAGAG AGAGCCAAAG GTTAAAGCTG AAAAACAATA 480

AACGAGAAGT TTTAGCAGGT ATTTACTCAT AAGATGATTT TCCTTTCAAG TAGCCAAAGT 540 

AGGCATTAAT CAAGAGAATA AGGAAAAAGA GAACTGCTGA GGTTGCAATA AGGGCTTCCC 600 

TATGCTGACC TGATGCGTAA GCCATTTCCA GAACAATATT GGTTGTTAAG GTTCTGGTTC 660 

CTGAAAAGAG TCCACTTGGA ATAATCGGCT GGTTGCCTGC CACCAAAATA ACTGCCATGG 720 
TTTCACCTAC TGCGCGACCG ATGCCTAAAA TAACTGCTGA AAA 763
(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 897 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

GGGTCTGTTT TGGCCTTGGC GGCTTCAGGT GGTTCAGGAG CTTGGCAGGG AGCTGGTCTC 60 

ATGTTGGTGT ATACGCTGGG CTTGGCGCTA CCATTCTTGC TTCTAGCTCT GACCTCTAGT 120 

TATGTTTTGA AACATTTCCG AAAACTTCAT CCCTATCTCG GAATCCTCAA AAAAGTGGGT 180 

GGTTTTCTCA TTATTGTGAT GGGATTCTTG GTTCTGTTTG GAAATGCTTC AATTTTAAGT 240 

CAATTATTTG AATAAAATGG AAAGGAATAT CAATATGAAA AAATGGCAAA CATGTGTTCT 300

TGGAGCAGGT TCGCTCCTTT GTTTGACGGC TTGTTCAGGC AAGTCCGTGA CTAGTGAACA 360 

CCAAACGAAA GATGAAATGA AGACGGAGCA GACAGCTAGT AAAACAAGCG CACTAAAAGG 420 

GAAAGAGGTG GCTGATTTTG AATTGATGGG AGTAGATGGC AAGACCTACC GTTTATCTGA 480 

TTACAAGGGC AAGAAAGTCT ATCTCAAATT CTGGGCTTCT TGGTGTTCCA TCTGTCTGGC 540 

TAGTCTTCCA GATACGGATG AGATTGCTAA AGAAGCTGGT GATGACTATG TGGTCTTGAC 600

AGTAGTGTCA CCAGGACATA AGGGAGAGCA ATCTGAAGCG GACTTTAAGA ATTGGTATAA 660 

GGGATTGGAT TATAAAAATC TCCCAGTCCT AGTTGACCCA TCAGGCAAAC TTTTGGAAAC 720 

TTATGGTGTC CGTTCTTACC CAACCCAAGC CTTTATAGAC AAAGAAGGCA AGCTGGTCAA 780 

AACACATCCA GGATTCATGG AAAAAGATGC AATTTTGCAA ACTTTGAAGG AATTATCCTA 840 

GGAGGCGTCT TATGAATGAT AAGTTAAAAA TCTTCTTGTT GCTAGGAGTA TTTTTTC 897

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 3499 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

TTTCTTTTTC CTAGGTGATT TTAATGAGGT TGAAATTCAA AATGTATTAG AATCATTTGG 60 
CTTTAAAGGT CGAAAAGGAG ATGTGAAGGT TCAGTATTGT CAACCTTATT CTAATATCCT 120 
TCAGGAAGGT ATGGTTCGGA AAAATGTGGG ACAATCCATT TTGGAATTAG GTTATCATTA 180
CCGTTCTAAA TATGGTGATG AGCAACATTT ACCCATGATT GTAATGAATG GTTTACTTGG 240

TGGATTTGCT CACTCTAAGC TCTTTACAAA TGTCCGTGAA AATGCTGGAT TAGCTTATAC 300 

CATTTCAAGT GAGCTTGATT TATTTAGTGG ATTCTTGAGG ATGTATGCTG GTATCAATCG 360 

AGAAAATCGT AACCAGGCTC GTAAAATGAT GAATAATCAA CTGCTTGATT TAAAAAAAGG 420 

TTATTTTACA GAGTTTGAGT TAAATCAGAC CAAGGAAATG ATTCGTTGGT CGTTGTTACT 480 

TTCTCAAGAT AATCAATCTT CATTGATTGA ACGTGCTTAT CAAAATGCCT TATTTGGAAA 540 

ATCTTCAGCA GACTTTAAAA GTTGGATTGC AAAGCTTGAA CAAATTGACA AAGATGCTAT 600 

TTGTAGAGTA GCTAATAATG TGAAACTACA AGCGATTTAC TTTATGGAAG GAATAGAATG 660 

ACAAAGGTTG TTTTTGAAGA AAAATACTAT CCAGCTGTAA AAGAAAAGGT TTATCGAACT 720 

CGTTTGGCCA ACGGATTGAC AGTTGCTCTT TTGCCTAAAA AGGAATTTAA AGAGGTTTAC 780 

GGGAGTGTCA CTGTACAGTT TGGTTCGGTA GATACGTTTG TCACAGAAGT TGACGGATAT 840 

GTAAAACAAT ATCCTGGAGG AATTGCTCAT TTTCTTGAAC ATAAATTATT TGAGAGAGAA 900 

GATTCTAGTG ATTTGATGTC GGCTTTTACG AGTCTAGGTG CAGATAGTAA TGCCTTTACA 960 

AGCTTTACAA AAACAAACTA TCTTTTTTCA GCAACGGATT ATTTTTTAGA AAATTTAGAT 1020 

TTACTTGATG AATTGGTAAC ATCAGCACAC TTTACTGAAG CTTCCATTCT GACAGAGCAG 1080 

GATATTATTC AGCAAGAACG AGAAATGTAC CAAGATGATC CAGATTCGTG TTTATTCTTT 1140

TCAACTTTAG CGAATTTGTA TCCTGGTACA CCTTTAGCAA CTGATATAGT TGGAAGTGAG 1200 

GAGTCCATTT CCCAAATCAA TCTAACTAAT TTGCAAGAAA ATTTTACAAA GTTTTACAAA 1260 

CCTGTAAACA TGTCTCTGTT TTTAGTTGGT AATTTTGATG TGGAGCGAGT ACAGGACTAT 1320 

TTTGAAAGCA AAGAACTGAA AGATTCAGAT TTTCAGGAAG TAGCAAGAGA AAAGTTGTTT 1380 

TTACAGCCTG TAAAGCCAAC AGATAGTATG AGAATGGAAG TATCTTCTCC CAAACTAGCG 1440 

ATTGGAGTTA GAGGTAAGCG AGAAGTTTCT GAAGCGGATT GCTATCGACA TCATATTTTA 1500 

TTAAAATTAT TGTTTGCAAT GATGTTTGGT TGGACTTCGG GATCGTTTTC AAAAATGTTA 1560 

TGAATCAGGT AAAATTGATG CGTCCTTATC TCTGGAAGTT AAATAACAAG TCGCTTTCAT 1620 

TTTGTCATGT TGACAATAGA TACGAAAGAG CCAGTTGCTT TGTCTCATCA ATTTAGGAAG 1680 

GCTATTCGTA ATTTTACAAA GGATTTAGAT ATTACAGAGG AACATTTAGA TATTATCAAA 1740 

AGAGAGATGT TTGGCGAATT TTTCAGTAGC ATGAACTCTC TTGAATTTAT TGCAACGCAA 1800 

TATGATGCTT TTGAAAATGG TGAGACAATT TTTGATTTGC CGAAAATTTT ACAGGAAATT 1860 

ACTTTAGAGG ATGTCCTTGA TGCTGGACAT CATTTAATAG ATGATGGTGA CATAGTTGAT 1920 

TTTACAATAT TCCCATCGTA GTAACCTATC ATAATAGACA CTAGAAAGAA GGGATGACAA 1980 

GTATGAGAAA AAAAACAATT GGAGAGGTTT TACGATTAGC TAGAATCAAT CAGGGATTGA 2040

GTTTAGATGA ATTGCAGAAA AAGACAGAAA TCCAGTTAGA TATGTTGGAA GCAATGGAAG 2100

CAGACGATTT CGATCAACTT CCAAGTCCTT TTTACACGCG TTCTTTCTTG AAAAAATATG 2160 

CATGGGCTGT TGAGTTAGAT GACCAAATTG TTTTGGATGC TTATGATTCT GGGAGTATGA 2220 

TTACTTATGA GGAAGTAGAT GTTGATGAAG ATGAGTTGAC AGGTCGTAGA CGTTCAAGTA 2280 

AGAAAAAGAA GAAAAAAACA TCATTTTTAC CTTTATTTTA TTTTATCCTT TTTGCTTTAT 2340

CGATTTTAAT TTTTGTGACT TATTATGTTT GGAACTATAT TCAAACTCAA CCAGAGGAGC 2400 

CTTCTCTTTC TAATTACAGT GTGGTTCAAT CAACAAGTTC AACTAGCTCT GTTCCCCACT 2460 

CCTCAAGTAG TAGTTCTTCT AGTATAGAAT CAGCTATAAG TGTATCAGGC GAAGGAAATC 2520 

ATGTAGAAAT CGCTTATAAG ACAAGTAAGG AAACAGTTAA ATTGCAATTG GCAGTTTCAG 2580 

ATGTTACAAG TTGGGTCAGT GTTTCAGAAA GCGAACTTGA GGGCGGTGTA ACCTTATCGC 2640

CAAAGAAGAA AAGTGCAGAA GCAACAGTTG CAACTAAAAG TCCTGTAACA ATTACGTTAG 2700 

GTGTTGTAAA AGGTGTTGAT TTGACAGTAG ATAATCAGAC TGTTGATTTA TCGAAATTAA 2760 

CAGCTCAGAC TGGACAAATC ACTGTAACCT TTACTAAAAA TTAAGGAAAA ACGAATGAAA 2820 

AAAGAACAAA TTCCCAATCT CTTAACAATA GGTCGAATTC TCTTTATACC TATTTTTATC 2880 

TTTATTTTAA CGATAGGAAA TTCGATAGAG AGTCATATAG TTGCAGCTAT TATCTTTGCT 2940

GTTGCCAGTA TTACCGACTA TTTAGATGGA TATTTAGCTC GTAAATGGAA TGTGGTCAGT 3000 

AATTTTGGTA AATTTGCAGA TCCTATGGCG GATAAGTTAC TAGTTATGTC GGCTTTTATT 3060 

ATGTTGATTG AGTTAGGTAT GGCTCCGGCT TGGATTGTTG CAGTGATTAT CTGTCGTGAG 3120 

TTAGCTGTGA CAGGTTTAAG GCTTTTATTG GTTGAAACTG GTGGAACAAT TTTAGCAGCA 3180 

GCAATGCCTG GAAAAATTAA AACTTTTAGT CAGATGTTTG CTATTATTTT CTTGCTATTA 3240

CATTGGACTT TGCTTGGTCA AGTTCTACTT TATGTAGCCT TATTTTTCAC TATCTACTCT 3300 

GGCTATGACT ATTTCAAGGG TAGTGCCTAT GTATTTAAAG GGACATTTGG TTCGAAATGA 3360 

AATCAATAAT TGATGTAAAA AATCTTTCTT TTCGCTATAA AGAAAATCAG AACTACTACG 3420 

ATGTGAAGGA TATTACGTTT CACGTGAAAC GTGGAGAATG GCTTTCGATT GTAGGGCATA 3480

ATGGTAGTGG TAAATCAAC 3499

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 821 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

ATTTTTGAAT AATCAAGCGG AACCAAGAGG TCTTCGTCCT TCATCTTGTT AATCATGTAT 60 

TCACTTGGAA TGGCAATATC GTAGGTCGTT CCACCCTGCT TTATCTTAGT GTACATGGCT 120 

TCGTTGGAGT CAAAAGCCTC GTACTGAACT TGAATTCCTG TTTCTTCTGT AAACTGAGTC 180 

AAGAGTTCAG GATCGATATA GTCTCCCCAG TTATAGATAA CCAATTTTTG ACTATCTCGA 240

CTATTGATTT TACTATCTAA ATGAGTCGCA ATTCCCCACA AGACAAGGAT AATCGCTGCA 300 

ATTCCTGCTA AAATGAATAG ATTTTTTTCA TGCTTGCTCC TCCTTCTCAC GAGAGATAAA 360 

GTAATAACCT ACAACTAGGA TAATACTAAA GAGAAAGACT AGAGCAGAAA GGGCATTGAT 420 

TTCTAGCGAA ATCCCCTTGC GAGCACGAGA GTAAATCTCG ACTGATAGGG TTGAAAAGCC 480 

ATTTCCTGTT ACAAAGAAGG TCACGGCAAA GTCATCTAAC GAATAGGTGA AGGCCATGAA 540 

ATAACCAGCA ATGATAGACG GAGTCAGGTA AGGAAGCATG ATTTCCTTAA ACATCTGAAA 600 

TTGACTAGCT CCCAAGTCAT AGGCCGCATG AATCATGTCG CCATTCATTT CCTTGAGTCG 660 

GAGGCAAGAC CATCAAGACC ACGATAGGAA TGGAGAAGGC CACATGACTA GATAGAACGG 720 

TCAAAAAGCC AAGTGAAAAC TTGAGTTGGG TAAAGAGAAT CAAGAAGTAG CACCAATCAT 780 

AACGTCAGGC GCAACCATGA GGATATTATT GAGTGATAGA A 821 
(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1309 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

GGCCGGTGCC ACAGTCCAAG CTATCGGTAT CGTGATTGAG AAATCCTTCC AAGATGGTCG 60 

TGATTTGCTT GAAAAAGCAG GCTACCCTGT CCTATCACTT GCTCGCTTGG ATCGTTTTGA 120 

AAATGGTCAG GTCGTATTTA AGGAGGCAGA TCTCTAATGC AAACTCAAGA AAAACACTCG 180

CAAGCAGCCG TTCTCGGCTT GCAGCACTTA CTAGCCATGT ACTCAGGATC TATCCTGGTT 240

CCCATCATGA TTGCGACAGC CCTTGGCTAT TCAGCTGAGC AGTTGACCTA CCTGATTTCT 300 

ACAGATATCT TCATGTGTGG GGTGGCAACC TTCCTCCAAC TCCAACTCAA CAAATACTTT 360 

GGGATTGGAC TCCCAGTCGT TCTTGGAGTT GCATTCCAGT CGGTCGCTCC CTTGATTATG 420 

ATTGGGCAAA GCCATGGTAG TGGCGCTATG TTTGGTGCCC TTATCGCATC TGGGATTTAC 480

GTGGTTCTTG TTTCAGGCAT CTTCTCAAAA GTAGCCAATC TCTTCCCATC TATCGTAACA 540 

GGATCTGTTA TTACCACGAT TGGTTTAACC TTGATCCCTG TCGCTATTGG AAATATGGGA 600 

AATAACGTTC CAGAGCCAAC TGGTCAAAGT CTCTTGCTTG CAGCTATTAC TGTTCTGATT 660 

ATCCTCTTGA TCAACATCTT TACCAAAGGA TTTATCAAGT CTATCTCTAT TTTGATTGGT 720 

CTGGTTGTTG GAACTGCCAT TGCTGCTACT ATGGGCTTGG TGGACTTCTC TCCTGTTGCG 780

GTAGTCCACT TGTCCATGTC CCAACTCCAC TCTACTTTGG GATGCCAACC TTTGAAATCT 840 

CATCTATTGT CATGATGTGT ATCATCGCAA CGGTGTCTAT GGTTGAGTCA ACTGGTGTTT 900 

ATCTAGCCTT GTCTGATATC ACAAAAGATC CAATCGACAG CACGCGCCTT CGCAACGGTT 960 

ACCGCGCAGA AGGTTTGGCG GTACTTCTCG GAGGAATCTT TAACACCTTC CCTTACACCG 1020 

GATTTTCACA AAACGTTGGT TTGGTTAAAT TGTCAGGCAT CAAAAAACGC CTGCCAATCT 1080

ACTACGCAGG TGGTTTCCTG GTTCTCCTTG GACTGCTTCC TAAGTTTGGT GCCCTTGCCC 1140 

AAATCATTCC AAGCTCCGTC CTCGGCGGTG CCATGCTGGT GATGTTTGGT TTTGTATCTA 1200 

TTCAAGGGAT GCAAATCCTC GCCCGAGTTG ACTTTGTAAC AATGAACACA ACTTCCTTAT 1260 

CGCAGTGTTT CAATCGCTGC AGGTGTCGGT CTCAACAACA AGTAATCTC 1309 
(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1031 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
TTAAAGTTCC AGTTTATCTA GGTTCTTCAT TTGCCTTTAT CACAGCTATG TCACTGGCTA 60 
TGAAAGAAAT GGGGGGTGAT GTATCTGCTG CCCAAACAGG GGTTATCTTG ACTGGTTTGG 120
TCTATGTCCT TGTTGCTACC AGCATCCGAT TTGTAGGAAC AAAATGGATT GATAAACTCT 180
TGCCACCAAT CATTATCGGT CCTATGATCA TCGTTATCGG TCTTGGACTT GCAGGTTCAG 240 

CTGTTACCAA TGCAGGTCTT GTAGCAGACG GAAATTGGAA AAATGCTCTG GTAGCCGTTG 300 
TTACTTTCCT AATTGCTGCC TTTATCAATA CAAAAGGAAA AGGCTTCCTA CGAATCATTC 360 
CATTCCTCTT TGCCATTATC GGTGGTTACC TTTTCGCACT AACTCTTGGC TTGGTTGACT 420
TTACACCAGT TCTTAAAGCC AACTGGTTCG AAATTCCTGG TTTCTACTTG CCATTTAGCA 480 
CAGGTGGTGC CTTTAAAGAG TACAATCTTT ACTTTGGTCC AGAAGCCATC GCTATCTTGC 540 

CAATCGCTAT CGTAACAATT TCTGAACATA TCGGAGACCA TACTGTTTTG GGTCAAATCT 600 
GTGGCCGTCA ATTCTTAAAA GAACCAGGTC TTCATCGTAC TCTTCTTGGT GACGGTATCG 660 
CAACTTCTGT TTCTGCCTTC CTTGGTGGAC CAGCCAATAC AACTTACGGA GAAAATACAG 720
GGGTTATCGG TATGACTCGT ATCGCTTCTG TCTCAGTTAT CCGTAACGCT GCCTTCATCG 780
CGATTGCCCT CAGCTTCCTT GGTAAATTCA CTGCCTTGAT TTCAACTATT CCAAACGCTG 840 

TACTTGGTGG TATGTCAATC CTTCTCTATG GGGTTATCGC CAGCAATGGT TTGAAAGTCT 900 
TGATTAAAGA ACGTGTTGAT TTCGCTCAAA TGCGAAACCT CATCATCGCA AGTGCTATGT 960 
TGGTTCTTGG ACTTGGAGGA GCTATCCTTA AACTTGGTCC AGTACACTTT CAGGTACTGC 1020
CCTTTCAGCC A 1031 
(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 568 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
ACAGTTTAAT CATTGCCTTG GCTACAACCC TCATTGCGAT TATTATTTCT GCTATGGCAG 60 
CCTATGGTAT TGTTCGATTC TTTCCTAAAT TGGGAGCAAT CATGTCGAGA CTACTCGTCA 120
TTACCTACAT TTTCCCACCA ATTTTGTTAG CAATTCCCTA TTCAATTGCC ATTGCTAAAG 180 
TTGGGTTAAC AAATAGTTTA TTTGGCTTGA TGATGGTTTA TCTATCTTTT AGTGTTCCAT 240 

ATGCAGTTTG GCTCTTAGTT GGATTTTTCC AAACAGTTCC AATTGGAATT GAAGAAGCGG 300
CTAGAATTGA TGGTGCAAAT AAATTTGTTA CGTTTTATAA AGTTGTGCTA CCGATTGTAG 360

CACCAGGTAT TGTAGCAACA GCTATTTATA CATTTATCAA TGCTTGGAAT GAATTTCTGT 420 

ATGCCTTGAT TTTGATTAAC AATACAGGAA AGATGACAGT AGCAGTAGCC CTTCGTTCAC 480 

TTAATGGTTC AGAAATACTA GACTGGGGAG ATATGATGGC AGCGTCTGTT ATTGTAGTTC 540 

TTCCATCAAT TATTTCTTCT CTATCACC 568

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 468 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

ACCAAAGACT TAGCTTCTTC AAAAAGCGGA TCACCACCAG CATCTCCATC CGAAAATTCT 60 

CCTTCATTTT CAGAAACCTC ACCTGGATCA AAACTCTCAT CGTAGTCTGC ATCTGCCTGA 120 

GTCTTGATGA AGTTCACAAT GCGCTCAACA TCGTCATCCG AGATAAAGGA GCCTTGGAGA 180

CGAACTGGAT GATTTTCATC AATCGGTTTA AAGAGCATGT CTCCTCGACC AAGAAGTTTT 240 

TCTGCTCCAT TTTCATCCAA AATCGTACGG GAGTCTGTTC CTGATGAAAC CGCAAATGCT 300 

ACACGAGATG GAACATTGGC CTTAATCAAA CCAGAGATGA CATCAACAGA TGGACGCTGA 360 

GTTGCAAGAA TCATGTGGAT ACCTGCAGCA CGCGCCTTCT GCCCAAGACG GATGATAGCA 420 

TCTTCCACTT CCTTGCTGGC CACCATCATG AGGTCAGCCA ACTCATCC 468

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 466 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

AAGCTGACAA TCTTTTCTGC AGTTGGAGCA TCCCAGAAGG ATACACCACT AAGGATGCGA 60 

CCTGCCTTGC TATCAACAAT AATGTCTTGA ACCTTGTAGT CATCTCCATA GACCAAGAAC 120 

CATTCGTTGG TACAATCTTC ACGATAAACA CTAAAATAAG TCGAACGAGT CAAATCATTG 180 

CGGAACATAT TTTTAAAGAG ATAGTTATCT GCATCAATAA CATAGCTGTT GGCCAATTCT 240 

TCTTTTACAA GATAGAGAGA GTAAAAGTTA TTGTAGTCAG CGTATTTATC ATTGAAAACG 300 

AGACGAACAC CGTATTTCTC TTTCAAGTAA TCGAATTGTT CTTTAAGATA ACCAACAATG 360

ATGATGATGT CATTGATTCC TTTTTCTTTG AGAAACTCAA TTTGGTACTC AATCAAAGGT 420 

TTTTGATTAA CCTGAACCAA GGCTTTAGGG GTATTTTCAG TCATAG 466 
(2) INFORMATION FOR SEQ ID NO: 13: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1040 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

CACATAATCT GTATATTGAC TATAAGTTTT AAAAAACAAT TTTTAAGCTC TTCCTTGTCT 60

TCTCTAACCA AGCGTGTTAT AATGAATACT GCTCAAGCGA CCTTCAATCG TGAAGCACAC 120 

ACGACCTTCA ATCGTGAATA AACGAATAGA TGGGAGACTT ACCATGAGTG ATAACTCTAA 180 

AACACGTGTT GTCGTGGGGA TGAGTGGTGG TGTTGATTCG TCGGTGACGG CTCTTTTGCT 240 

CAAGGAGCAG GGCTACGATG TGATCGGTAT CTTCATGAAG AACTGGGATG ACACAGATGA 300 

AAACGGCGTC TGTACGGCGA CCGAAGATTA CAAGGATGTG GTTGCGGTGG CAGATCAGAT 360

TGGCATTCCC TACTACTCTG TCAATTTTGA AAAAGAGTAC TGGGACCGCG TTTTTGAGTA 420 

TTTCCTAGCT GAATACCGTG CAGGGCGCAC GCCAAATCCG GACGTTATGT GCAACAAGGA 480 

AATCAAGTTC AAGGCCTTTT TGGACTATGC CATGACCTTG GGGGCAGACT ATGTAGCGAC 540 

TGGGCACTAT GCTCGAGTGG CGCGTGATGA GGATGGCACT GTTCACATGC TTCGTGGCGT 600 

GGACAATGGC AAGGATCAGA CCTATTTCCT CAGCCAACTT TCGCAAGAAC AACTTCAAAA 660

AACCATGTTC CCACTAGGAC ATTTGAAAAA GCCTGAAGTT CGAAAACTAG CAGAAGAAGC 720

AGGTCTTTCG ACTGCTAAGA AGAAAGACTC GACAGGGATT TGCTTTATCG GAGAAAAGAA 780 

CTTTAAAAAC TTTCTCAGCA ACTACCTGCC AGCTCAGCCT GGTCGTATGA TGACTGTGGA 840 

TGGTCGTGAT ATGGGCGAGC ATGCTGGTCT TATGTACTAT ACAATCGGTC AGCGTGGCGG 900 

ACTCGGTATC GGTGGGCAAC ACGGTGGTGA CAATGCCCCT TGGTTCGTTG TCGGAAAAGA 960 

TCTAAGCAAG AATATTCTCT ATGTAGGCCA AGGTTTCTAC CATGATTCGC TCATGTCAAC 1020 

CACTAGAGGC TAGCCAAGTC 1040 
(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3071 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
CCGGGGATCT GATAGCCAAT AGAAAACCGC AGAGTCAAAG GGTTTTGTAT GAATTGCGAG 60 

ATCGTTTGAA GAGAAATCAG TTTATACTCA ATGATACCAA TCCGGATATT GTCATTTCCA 120 
TTGGCGGGGA TGGTATGCTC TTGTCGGCCT TTCATAAGTA CGAAAATCAG CTTGACAAGG 180 
TCCGCTTTAT CGGTCTTCAT ACTGGACATT TGGGCTTCTA TACAGATTAT CGTGATTTTG 240
AGTTGGACAA GCTAGTGACT AATTTGCAAC TAGATACTGG GGCAAGGGTT TCTTACCCTG 300 
TTCTGAATGT GAAGGTCTTT CTTGAAAATG GTGAAGTTAA GATTTTCAGA GCACTCAACG 360 

AAGCCAGCAT CCGCAGTCTG ATCGAACCAT GGTGGCAGAT ATTGTAATAA ATGGTGTTCC 420 
CTTTGAACGT TTTCGTGGAG ACGGGCTAAC AGTTTCGACA CCGACTGGTA GTACTGCCTA 480 
TAACAAGTCT CTTGGCGGTG CTGTTTTACA CCCTACCATT GAAGCTTTGC AATTAACGGA 540
GATTGCCAGC CTTAATAATC GTGTCTATCG AACATTGGGC TCTTCCATTA TTGTGCCTAA 600 
GAAGGATAAG ATTGAACTTA TTCCAACAAG AAACGATTAT CATACTATTT CGGTTGACAA 660 

TAGCGTTTAT TCTTTCCGTA ATATTGAGCG TATTGAGTAT CAAATCGACC ATCATAAGAT 720 
TCACTTTGTC GCGACTCCTA GCCATACCAG TTTCTGGAAC CGTGTTAAGG ATGCCTTTAT 780 
CGGTGAGGTG GATGAATGAG GTTTGAATTT ATCGCAGATG AACATGTCAA GGTTAAGACC 840

TTTTTAAAAA AGCACGAGGT TTCTAAGGGA 1 TGCTGGCCA AGATTAAGTT TCGAGGTGGA 900

GCTATTCTGG TCAATAATCA ACCGCAAAAT IJ\L rGGACGT TGGAGACTAC OCA 960

GTTACCATTG ACATTCCCGC TGAGAAAGGC 1 Tl bAAACCT TGGAGGCTAT TGAGCTTCCA 1020

TTAGATATTC TCTATGAGGA TGACCACTTT CTAGTCTTGA ATAAACCCTA TGGAGTGGCT 1080

TCTATTCCTA GTGTTAATCA CTCTAATACC ATTGCCAATT TTATCAAGGG TTACTATGTC 1140

AAGCAAAATT ATGAAAATCA GCAGGTTCAC ATTGTTACCA GACTAGATAG GGACACTTCT 1200

GGCTTGATGC TCTTTGCCAA GCACGGTTAT GCCCATGCAC GATTAGACAA GCAGTTGCAG 1260

AAGAAATCTA TCGAGAAACG CTACTTTGCT TTGGTTAAGG GAGATGGACA TTTGGAGCCA 1320

GAAGGGGAAA TTATTGCTCC GATTGCGCGT GATGAAGATT CCATTATTAC CAGACGAGTG 1380

GCTAAAGGCG GAAAGTATGC CCATACTTCA TACAAGATTG TAGCTTCTTA TGGAAATATT 1440

CACTTGGTCT ATATTCACCT GCACACTGGT CGAACCCATC AAATCCGAGT CCATTTTTCT 1500

CATATCGGTT TTCCTTTGCT GGGAGATGAT TTGTATGGTG GTAGTCTGGA AGATGGTATT 1560

CAACGTCAGG CTCTGCATTG CCATTACCTA TCCTTTTATC ATCCATTTTT AGAGCAAGAC 1620

TTGCAGTTAG AAAGTCCCTT GCCGGATGAT TTCAGTAACC TTATTACCCA GTTATCAACT 1680

AATACTCTAT AAAAACTGTC TCAGAGTATA ATTATTATCT TAAAGGAGAA AACTCATGGA 1740

AGTTTTTGAA AGTCTCAAAG CCAACCTTGT TGGTAAAAAT GCTCGTATCG TTCTCCCTGA 1800

AGGGGAAGAG CCTCGTATTC TTCAAGCAAC AAAACGCTTA GTAAAAGAAA CAGAAGTGAT 1860

TCCTGTTTTG CTTGGAAATC CTGAAAAAAT TAAAATTTAT CTTGAAATTG AAGGAATCAT 1920

GGATGGTTAT GAGGTCATCG ACCCTCAACA TTATCCTCAA TTTGAAGAAA TGGTTTCTGC 1980

CTTGGTGGAG CGTCGCAAGG GCAAAATGAC TGAAGAAGAT GTACGCAAGG TTTTGGTTGA 2040

AGATGTCAAC TACTTTGGTG TGATGTTGGT TTACTTGGGC TTGGTTGATG GAATGGTGTC 2100

AGGAGCGATT CACTCAACAG CTTCAACAGT TCGCCCAGCT CTACAAATCA TCAAAACTCG 2160

TCCAAATGTA ACTCGTACTT CAGGAGCCTT CCTCATGGTT CGTGGTACGG AACGTTACCT 2220

ATTTGGAGAC TGTGCCATTA ATATCAATCC AGATGCAGAA GCCTTGGCTG AAATTGCCAT 2280

CAACTCAGCA ATCACAGCTA AGATGTTTGG CATCGAACCT AAAATTGCCA TGTTGAGCTA 2340

TTCTACTAAA GGTTCAGGGT TTGGTGAAAG CGTTGATAAG GTCGTTGAAG CAACTAAAAT 2400

TGCTCACGAC TTGCGTCCTG ACCTTGAAAT CGATGGTGAG TTGCAATTTG ATGCGGCCTT 2460

TGTTCCCGAA ACTGCAGCTC TGAAAGCTCC GGGAAGTACA GTAGCTGGTC AAGCAAATGT 2520

CTTCATCTTC CCAGGTATCG AGGCAGGAAA TATCGGTTAC AAGATGGCTG AACGCCTGGG 2580

TGGCTTTGCG GCTGTAGGAC CTGTTTTGCA AGGTTTAAAC AAGCCAGTTA ATGATCTTTC 2640

TCGTGGATGT AATGCAGATG ATGTTTACAA GTTGACCCTC ATCACAGCAG CTCAAGCAGT 2700

TCATCAATAG TGAAAACTAT AAAGTGATAT ACTATGCTAT ACTGTAGTTA TGAAACTATG 2760

TACGAAAAGC ACTGCCATTA ATTCCTGAGA ACTAAATTAC TGATTGGTGT CAAAAAGGAA 2820 

AACTTCCAAG CGATGATATC CTGTCTATAC ACGACCTATA GAAATCTGTA ATATACATGT 2880 

CCGTAAAACG ATAAATTCCC TTTTTGATTT TAAATGAGTA TGAAAAGAGA ATTTTCCGGC 2940 

TCTTTGTCAA CTGTAGTGGG TTGAAAAAAA GCTAAGCTCG AGAAAGGACA AATTTTGTCC 3000

TTTCTTTTTT GATATTCAGA GCGATAAAAA TCCGTTTTTT GAAGTTTTCA AAGTTTCGAC 3060 

TCTAGAGGAT C 3071 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 720 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15: 
TTTCCATGGT ATGGTAAAGG TTTTTCTTTT TTTTAAAAGG AAAACGAGAA GAGGAGGTTC 60
TTATGAAAGC AAGCATTGCC TTGCAAGTTT TACCCCTAGC ACAGGGGATT GATCGGATAG 120 
CTGTTATTGA TCAGGTCATT GCTTATCTGC AAACTCAAGA AGTGACGATG GTAGTGACAC 180 

CATTTGAAAC GGTCTTGGAA GGGGAGTTTG ATGAGCTTAT GCGCATTCTA AAAGAAGCGC 240 
TGGAAGTGGC AGGGCAGGAG GCAGACAATG TCTTTGCCAA TGTCAAAATA AATGTAGGAG 300 
AGATTTTAAG TATTGATGAG AAACTTGAAA AGTATACTGA GACGACACAT TAGTCTATTG 360
GGCTTTCTCG GAGTATTGTC AATCTGGCAG TTAGCAGGTT TTCTTAAACT TCTCCCCAAG 420 
TTTATCCTGC CGACACCTCT TGAAATTCTC CAGCCCTTTG TTCGTGACAG AGAATTTCTC 480 

TGGCACCATA GCTGGGCGAC CTTGAGAGTG GCTTTACTGG GGCTGATTTT GGGAGTTTTG 540 
ATTGCCTGTC TTATGGCTGT GCTCATGGAT AGTTTGACTT GGCTCAATGA CCTGATTTAC 600 
CCTATGATGG TGGTCATTCA GACCATTCCG ACCATTGCCA TAGCTCCTAT CCTGGTCTTG 660
TGGCTGGGTT ATGGGATTTT TGCCCAAGAT TGTCTTGATT ATCTTAACAA CAACCTTTCC 720 

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 852 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

GCCGTCATAA TCATGCGCCG AATCCGTCCC CATTAAAATC TGGGTCTGTA AAGACAATGA 60 

CTCCATGACG TTGGTGTAGA CGCTGAATCC GCTCTATGTC CTGGTCATTG ATGGCAGAAC 120

CTCGAGTCTC ATAGGTCTCC ACATCGAAAT AACGCTTGAG ATTGACCGTA TCATCACGAC 180 

CTTCAACCAC GATAACTTGG GAAATTCTCT CTTTCATTAC TTGCTGTCCA ATCCCAAAAA 240 

TGCGTTCTGC ATTTGCAGTC GTTGCTACCG CCAGCTCTTC TGTCGTCATA CCACGCAAGT 300 

CAGCGATAAA GTCGACCACA TAGCGAGTAT AGGCTGTTTT ATTTTCACGA CCACGCTTGG 360 

GTACAGGTGC TAAGTAAGGC GCATCTGTTT CTACCAACAT CTTGTCCAAA GGTAACTCTT 420

TAGCTGCTTC TTGGAGGTCA GTTGCCTTCT TGAAGGTCAC CACTCCTGAG AAGGAAATGG 480 

TCATACCAAG ATCCCGGTAC CGAGCCCACT CAAGCGTCCC TGAAAATGAA TGCATGATAC 540 

CACCACGAGG ACCAACGCCC TCACTCTTGA TAATCTCATA GGTATCTTCC AGCGCATCAC 600

GGGTATGGAC AACAAAAGGC AAATCCAAGT CCTTAGATAG CTGAATCTGA CGGCGAAAAA 660 

CCTGCTCCTG CACCTCTTGG GCGCTGTCAT CCAATGGTAG TCTAAGCCAA TTTCACCTAA 720

AGCCACAACC TTGGAATGTT TTAACTTATC CAACAAGTAA GCCTCAACTT CCTCTGTATA 780 

AGTACCAGCT TCTGTAGGAT GCCAACCAAT AGTCGCATAG AGCTGCTCAT ACTCATCTAC 840 

CAAACTCCAA GG 852 
(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 868 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

GGGGATCCTC TAGAGTCGAT ATCTACGGTC TCAACCGTAC AGGACTGTTG AACGATGTAC 60 

TGCAAGTTCT TTCAAATACA ACCAAGAATA TTTCAACGGT CAATGCCCAA CCAACCAAGG 120 

ATATGAAGTT TGCTAATATC CATGTGTCCT TCGGTATTGC CAACCTCTCT ACACTGACCA 180

CGGTTGTCGA TAAAATTAAG AGTGTGCCAG AAGTTTACTC TGTCAAACGG ACCAACGGCT 240 

AGTTGTCCTA GCTCTTACTA GAAAGGCTAT TATGAAAATC ATTATCCAAC GGGTTAAAAA 300 

AGCCCAAGTG AGTATAGAAG GCCAGATTCA GGGAAAAATC AATCAGGGAC TTTTATTGCT 360 

GGTTGGTGTT GGACCAGAGG ACCAAGAGGA AGATTTGGAC TATGCTGTGA GAAAACTGGT 420 

CAATATGCGG ATTTTTTCAG ACGCAGAAGG CAAGATGAAC CTGTCTGTCA AAGATATTGA 480

AGGAGAAATC CTCTCTATTT CTCAGTTTAC CCTCTTTGCG GATACTAAGA AAGGCAATCG 540 

TCCAGCCTTT ACAGGGGCAG CTAAACCTGA TATGGCATCA GACTTCTATG ATGCTTTCAA 600 

TCAAAAATTA GCGCAAGAAG TGCCCGTTCA GACAGGTATC TTTGGAGCAG ATATGCAGGT 660 

TGAGCTGGTT AATAACGGAC CTGTTACCAT TATCCTAGAT ACTAAAAAGA GATAAGAAAG 720 

ACCAAGCCCA GTCGGCTTGG TCTTTCTCAT CGATCATAAA AATACTCCAA AAAGAAATCG 780

GTTCTTGATA TGCTTGGGGG ACTCTTTTCA GGCTTTGGCA GATGCGATAG GAAGGGATGA 840 

GATGTCCTAG GGTGAGGAGA GTTCCCTG 868 
(2) INFORMATION FOR SEQ ID NO: 18: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1399 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

CGGTCCTCGT CCGATTGACT CACACCTTAA GGCGTTTGAA GCTATGGGTG CCACTGCTAG 60

CTACGAGGGA GATAACATGA AGTTATCTGC TAAAGATACA GGACTTCATG GTGCAAGTAT 120 

TTACATGGAT ACGGTTAGTG TGGGAGCAAC GATTAATACG ATGATTGCTG CGGTTAAAGC 180 

AAATGGTCGT ACTATTATTG AAAATGCAGC CCGTGAACCT GAGATTATTG ATGTAGCTAC 240

TCTCTTGAAT AATATGGGTG CCCATATCCG TGGGGCAGGA ACTAATATCA TCATTATTGA 300

TGGTGTTGAA AGATTACATG GGACACGTCA TCAGGTGATT CCAGACCGCA TTGAAGCTGG 360 

AACATATATA TCTTTAGCTG CTGCAGTTGG TAAAGGAATT CGTATAAATA ATGTTCTTTA 420 

CGAACACCTG GAAGGGTTTG TTGCTAAGTT GGAAGAAATG GGAGTGAGAA TGACTGTATC 480 

TGAAGACAGC ATTTTTGTCG AGGAACAGTC TAATTTGAAA GCAATCAATA TTAAGACAGC 540

TCCTTACCCA GGCTTTGCAA CTGATTTGCA ACAACCGCTT ACCCCTCTTT TACTAAGAGC 600 

GAATGGTCGT GGTACAATTG TCGAGTCGAT ACGATTTACG AAAAACGTGT AAATCATGTT 660

TTTGAACTAG CAAAGATGGA TGCGGATATT TCGACAACAA ATGGTCATAT TTTGTACACG 720 

GGTGGACGTG ATTTACGTGG GGCCAGTGTT AAAGCGACCG ACTTAAGAGC TGGGGCTGCA 780 

CTAGTCATTG CTGGGCTTAT GGCTGAAGGC AAAACTGAAA TTACCAATAT CGAGTTTATC 840

TTACGTGGTT ATTCTGATAT TATCGAAAAA TTACGTAATT TAGGAGCGGA TATTAGACTT 900

GTTGAGGATT AAACCGTAGA GGTGTTTATG AATATTTGGA CCAAATTAGC AATGTTTTCT 960 

TTTTTTGAAA CGGATCGCTT GTATTTGCGT CCTTTCTTTT TTAGTGATAG TCAGGACTTC 1020 

CGCGAGATAG CTTCAAATCC AGAAAATCTT CAATTTATTT TCCCAACGCA GGCAAGTCTG 1080 

GAAGAAAGTC AATATGCACT GGCCAATTAC TTTATGAAGT CCCCTTTGGG AGTGTGGGCA 1140

ATTTGTGACC AGAAAAATCA ACAAATGATT GGTTCTATTA AATTTGAGAA GTTAGATGAA 1200 

ATCAAAAAAG AAGCTGAGCT TGGCTATTTT TTGAGAAAAG ATGCTTGGTC GCAAGGATTT 1260

ATGACAGAGG TTGTTAGAAA AATTTGTCAG CTTTCTTTTG AGGAATTTGG CTTAAAACAA 1320 

TTATCTATCA TTACCCACCT TGAAAATGAA GCTAGCCAAA GAGTTGCTCT TAAGTCTGGA 1380 

TTTAGTTTGT TCCGTCAGT 1399

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1779 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 
AGATTGCTCT TGAACACGAT GAAATACCAA TTGGTTGTGT GATTGTCAAA GATGGGAAAA 60

TCATTGGTCG TGGGCATAAT GCGCGTGAGG AATTACAGCG ACGGTTATGC ATGCGGAAAT 120

TATGGCTATA GAGGATGCGA ACTTGAGTGC AGGAGACTGG CGCTTGCTGG ATTGCACACT 180

TTTTGTGACC ATTGAACCAT GTGTCATGTG TAGTGGGGCG ATTGGGCTTG CCCGTATTCC 240

AAATGTGGTC TATGGGGCTA AAAACCAGAA ATTTGGCGCT GCTGGAAGTT TGTACGATAT 300

CTTGACAGAT GAGCGTCTTA ACCATCGTGT AGAGGTTGAA ACGGGAATTT TGGAAGATGA 360

ATGTGCAGCT ATCATGCAGG ACTTTTTTAG AAATAGACGG AAAAAATAAT TTTGCTTTTA 420

AAATGAATAG GAATGTGATA TAATAAATAG TGGAGCAACA GTTCTGCGTG AAGCGGGTCA 480

GGGGAGGAAT CCCAGCAGCC CTAAGCGATT TGAATTGTGT GCTCTTTTTT TCGTGCTTTT 540

TTCCGAATAA ATAAGATAGA ATAATCTAGA ATAAATGATA ATAGAAAAGA GAAAATTATG 600

AAAATTCGTG GTTTTGAATT GGTTTCGAGT TTTACAGATG AAAATTTATT GCCCAAGCGT 660

GAGACAGCGC ATGCGGCTGG TTACGACTTA AAGGTTGCTG TGCGTACAGT TGTTGCGCCA 720

GGAGAGATTG TCTTGGTTCC GACAGGGGTT AAGGCTTATA TGCAGCCGAC TGAGGTTCTC 780

TACCTCTATG ATCGTTCTTC AAATCCTCGT AAGAAGGGCT TGGTTTTAAT TAACTCAGTT 840

GGGGTCATTG ATGGGGATTA TTATGGAAAT CCTGGAAATG AAGGGCATAT TTTTGCGCAG 900

ATGAAGAATA TCACAGACCA AGAGGTTGTT CTTGAAGTTG GGGAGCGTAT TGTCCAGGCT 960

GTTTTTGCTA CTTTCTTAAT TGCAGATGGA GATGCAGCTG ATGGCGTTCG AACTGGTGGA 1020

TTTGGATCGA CAGGGCACTA GAATGAAGAT TATCTTTGTA CGTCATGGGG AGCCAGATTA 1080

CCGTGAGTTA GAGGAGCGTT CTTATATAGG ATTTGGGATA GATTTGGCAC CCTTGTCTGA 1140

GATGGGACGG CAGCAAGTCC AGAAATTGAG CAAAAATCCT TTACTCTCGT CAGCTGAAAT 1200

AATCGTATCT TCTGCAGTCA CAAGAGCTTT AGAAACGGCT TCGTATGTGG TCTGTGCTAC 1260

GGGTCTTCCT TTAAGAGTAG AGCCTTTATT ACATGAATGG CAGGTCTATA AAACAGGAAT 1320

AGAAAACTTT GAAACAGCTA GAAGACTGTT TTTAGAAAAC AAGGGGGAGT TGCTTCCTAA 1380

TAGTCCTATT CAATATGAGA CAGCTACGGA AATGAAGTCT CGGTTTCTAG AATGTATGTC 1440

TAAGTATCGA GAACATCAGA CTGTGGTAGT TGTTGCTCAT CGACTCTAGA GGAGCCAGTT 1500

TGTGCCAAAT GAGAAGATTG ATTTTTGCCA AGTGATTGAG TGTGAGTTAG AGATATAGAA 1560

AGAGGTTTGT CATCGCAAAG AAAAAAGCGA CATTTGTATG TCAAAATTGT GGGTATAATT 1620

CCCCTAAATA TCTGGGACGT TGCCCCAACT GTGGGTCTTG GTCTTCTTTT GTGGAAGAGG 1680

TTGAGGTTGC CGAAGTTAAG AATGCGCGTG TGTCCTTGAC AGGTGAGAAA ACCAAGCCCA 1740

TGAAACTAGC TGAGGTGACT TCCATCAATG TCAATCGAC 1779

(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3725 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:

GCGGATCCTC TAGAGTCGAA AGATTACGAA GGTAAGAACC CTCTTTATTA CTGGTCACAT 60

CATGGTACAA CAAGCTGCAA CAGTATCTCT TATGGTTCTA TTCTTAGTAC CACAATTGCG 120

CAATGCTTAC GGTACAGCAG CGATTGGTAT CATCTGTGGA CTTTACTGGG CAGTTAGTTC 180

AAATATGACT GTTGAGGCAA CTCAACGCTT GACTGGTGGT GGCGGATTTG CGATTGGTCA 240

CCAACAGCAA TTTGCAATCT GGTTTGTAGA TAAAGTAGCA GGACGCTTTG GTAAGAAAGA 300

AGAAAGTTTA GACAATCTTA AATTACCTAA GTTCCTCTCA ATCTTCCACG ATACAGTTGT 360

TGCATCTGCT ACCTTGATGC TCGTATTCTT CGGGGCCATT CTTTTAATCT TGGGTCCAGA 420

CATTATGTCT AATAAAGAAG TCATCACTTC AGGAACTCTA TTCAATCCTG CTAAACAAGA 480

TTTCTTTATG TACATTATCC AAACAGCCTT TACCTTCTCA GTTTACTTGT TCGTTTTGAT 540

GCAAGGTGTC CGAATGTTCG TATCTGAGTT AACAAACGCT TTCCAAGGTA TTTCAAACAA 600

ATTGTTGCCA GGTTCATTCC CAGCGGTTGA CGTTGCAGCT TCTTATGGAT TTGGTTCTCC 660

AAATGCTGTC TTGTCAGGAT TTACCTTTGG TTTGATTGGT CAATTGATTA CAATTGTCTT 720

GCTCATCGTC TTTAAAAATC CGATTCTTAT TATTACAGGA TTTGTACCAG TGTTCTTTGA 780

CAATGCAGCC ATTGCGGTCT ACGCTGATAA ACGCGGCGGA TGGAAAGCGG CTGTTATCCT 840

TTCCTTTATA TCAGGTGTCC TTCAAGTTGC TCTAGGAGCT CTTTGTGTGG CCCTTCTCGA 900

TTTGGCATCT TATGGTGGCT ACCATGGAAA TATCGACTTT GAATTCCCAT GGCTTGGATT 960

TGGATATATC TTCAAATACC TTGGTATTGT TGGTTATGTA CTTGTGTGTC TCTTCTTGCT 1020

TGTTATTCCT CAACTTCAAT TTGCCAAAGC AAAAGATAAA GAGAAATATT ACAACGGTGA 1080

AGTTCAAGAA GAAGCTTAGT ATCTAGAAAA GGAGAAATAA AATGGTTAAA GTATTAGCAG 1140

CGTGCGGAAA TGGAATGGGT TCATCAATGG TTATCAAGAT GAAGGTTGAA AATGCTCTCC 1200

GTAAGCTTAA TCAAACAGAT TTTACAGTCA ATTCATGCAG TGTCGGTGAA GCTAAAGGTT 1260

TAGCAGTAGG ATATGACATC GTAATCGCTT CTCTTCATTT GATTCAAGAA TTGGAAGGGC 1320
GAACTAATGG GAAGTTAATT GGACTTGATA ACTTGATGGA TGATAAAGAA ATCACCGAAA 1380

AACTCAGTCA AGCACTACAG TAAAAGGTTG GAGGGGGCTG GACAGAAACT GAGAGTTATC 1440 

GTTTCTGTCC TTCTCCCTCT TTAAATAAAG GAGGCAGATA TGAATTTAAA ACAAGCTTTA 1500 

ATTGACAATG ACTCGATCCG ACTAGGTTTA GAGGCTAACA ATTGGAAAGA AGCAGTCAAG 1560 

GTAGCAGTAG ATCCCTTAAT TGAAAGTGGG GCAATTTTGC CAGAGTATTA CGATGCTATC 1620 

ATTGAATCGA CTGAAGAGTA TGGGCCTTAC TATATCTTGA TGCCAGGTAT GGCTATGCCC 1680 

CACGCTAGAC CTGAAGCTGG TGTGCAAAGT GATGCCTTTT CATTGATTAC CTTACAAAAT 1740 

CCTGTTGTAT TTTCAGATGG GAAAGAGGTA TCTGTTTTGT TGGCACTAGC AGCAACAAGC 1800

TCAAAAATTC ACACAAGTGT AGCCATTCCA CAAATTATTG CCCTGTTTGA ATTAGAAGAT 1860 

TCTATTGCAC GTTTACAGGC TTGCCAGACT AAAGAAGATG TCTTGGCTAT GATTGAAGAA 1920 

TCTAAGGATA GCCCTTATCT CGAAGGATTG GATTTGGAAA GTTAGAAAGA GGAATAAAGA 1980 

AATGACAAAA AGAATACCTA ATTTACAAGT TGCATTAGAC CATTCAGACT TGCAAGGAGC 2040 

GATTAAAGCA GCTGTTTCTG TTGGTCAGGA AGTAGATATT ATCGAAGCTG GAACTGTTTG 2100

CTTGCTTCAA GTTGGAAGTG AACTGGCTGA AGTCTTGCGT AGCCTTTTCC CAGATAAGAT 2160 

TATTGTGGCA GACACAAAAT GTGCTGATGC TGGTGGAACA GTTGCTAAAA ATAATGCGGT 2220 

TCGTGGAGCA GACTGGATGA CTTGTATCTG TTGTGCAACC ATCCCTACTA TGGAAGCAGC 2280 

TCTAAAGGCT ATCAAGACTG AACGAGGAGA ACGAGGCGAA ATCCAGATCG AGCTTTATGG 2340 

CGATTGGACT TTTGAACAAG CTCAGCTTTG GCTAGATGCA GGTATTTCAC AAGCTATTTA 2400

TCACCAATCT CGTGATGCTC TTCTTGCTGG TGAAACTTGG GGTGAAAAAG ACCTTAATAA 2460 

GGTTAAAAAA CTCATTGACA TGGGCTTCCG TGTATCTGTA ACAGGTGGTC TAGATGTAGA 2520 

TACTCTCAAA CTCTTTGAAG GTGTTGATGT CTTTACCTTT ATCGCAGGTC GTGGAATTAC 2580 

AGAGGCTGCG GATCCAGCAG GAGCAGCGCG TGCCTTCAAG GATGAAATCA AACGAATTTG 2640 

GGGGTAAATC ATGGTACGTC CAATTGGAAT TTATGAAAAG GCAACCCCAA CACACTTTAC 2700

TTGGCTAGAA CGTTTAAATT TTGCCAAGGA GTTAGGCTTT GATTTTGTCG AGATGTCTAT 2760 

TGACGAACGT GACGAGCGTT TAGCAAGACT TGACTGGAGT AAGGAAGAAC GCTTGGAAGT 2820 

TGTCAAAGCA ATCTATGAAA CTGGTGTTCG TATTCCTTCT ATCTGTTTTT CAGGCCATCG 2880 

TCGCTACCCA TTGGGTTCAA AAGATCCAGT TCTAGAGGAA AAATCTCTAG AACTCATGAA 2940 

AAAATGTATC GAATTAGCTC AAGACTTGGG AGTTCGTACG ATTCAATTAG CTGGTTACGA 3000

TGTTTACTAT GAGGAAAAGT CACCCCAGAC ACGCCAACGT TTTATCAAAA ATTTGAGAAA 3060 

AGCCTGTGAC TGGGCTGAAG AAGCTCAGGT GGTACTTGCT ATTGAAATTA TGGATGATCC 3120 

TTTCATCAAT AGCATCGAAA AATATTTGGC TATAGAAAAA GAGATTGACT CTCCCTTCCT 3180 
CTTTGTATAT CCAGATATTG GTAATGTGTC TGCATGGCAT AATGATATCT ATAGTGAGTT 3240

TTATCTTGGT CATCATGCCA TCGCAGCTCT CCATCTCAAG GATACTTATG CAGTGACAGA 3300

AAGTTCAAAG GGCCAGTTCC GAGATGTACC TTTCGGGCAA GGTTGTGTCA AATGGGAAGA 3360

AGCTTTCGAT ATTTTAAAGG AAACCAATTA TAATGGACCT TTCCTAATCG AAATGTGGTC 3420

TGAAAATTGT GAAACAGTAG AAGAAACACG CGCAGCCGTT CAAGAGGCGC AAGCTTTTCT 3480

CTATCCACTC ATTAAGAAAG CAGGTTTGAT GTAAGATGAA TCAAGTAATC AATGCTATGC 3540

GTAAACGAGT CTGTGATGCC AATCAATCAT TGCCAAAACA TGGACTTGTC AAATTTACCT 3600

GGGGGAATGT ATCTGAAGTT AATCGCGAAC TCGGTGTCAT TGTTATCAAA CCATCAGGCG 3660

TGGATTATGA CGAATTGACA CCTGAAAACA TGGTAGTGAC TGATCTAGAT GGTAAGATCC 3720

CCGGT 3725



(2) INFORMATION FOR SEQ ID NO:21: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2483 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21: 

TCTAGAATCA TTTCCCAGCA GTTGGCTCAG GAAGTCGCAA TTATCTGGGT GAGTTTTCAG 60 

CGTGTTGGAC TGAAGTGGAG ATTGATGAGA TATACCGCGC CTTTGTCATG GCACATTTCA 120 

AGAGTTCGCG TCCAGATGCC CAGACCTTGA TTTTCTATAC CCACTATGAC ACTGTGCCAG 180 

CGGATGGGGA TCAGGTCTGG ACAGAGGATC CTTTTACGCT TTCGGTCCGC AATGGCTCAT 240 

GTATGGGCGT GGGGTTGATG ACGACAGGGT CATATCACAG CTCGCTTGAG TGCTTGAGAA 300 

AATATATGCA GCCCTGATGA TTACCTGTCA ATATCAGCTT TATCATGGAG GGAGCGGAGG 360 

AATCGGCTTC AACAGACCTA GATAAGTATT TGGAAAAGCA TGCAGACAAA CTCCGTGGGG 420 

CGGATTTGTT GGTCTGGGAA CAAGGGACCA AAAATGCCTT GGAACAGCTG GAAATTTCTG 480 

GTGGCAATAA GGGGATTGTG ACCTTTGATG CCAAGGTAAA AAGCGCTGAT GTGGATATCC 540 

ACTCGAGTTA TGGTGGTGTT GTGGAATCAG CTCCTTGGTA TCTCCTCCAA GCCTTACAGT 600 

CTCTTCGTGC TGCGGATGGC CGTATCTTGG TTGAAGGCTT GTACGAAGAA GTACAAGAGC 660 
CCAATGAACG AGAAATGGCC TTGCTAGAAA CTTATGGTCA ACGAAACCCA GAGGAAGTTA 720

GTCGGATTTA TGGATTGGAG TTGCCTCTCT TACAGGAGGA GCGGATGGCC TTTCTAAAAC 780

GTTTCTTTTT CGAGCCAGCG CTTAATATCG AAGGAATCCA GTCTGGTTAT CAAGGTCAGG 840

GTGTTAAGAC TATTTTGCCT GCAGAAGCCA GTGCCAAGCT AGAGGTTCGT CTGGTTCCGG 900

GCCTAGAACC GCATGATGTT CTGGAAAAAA TTCGGAAACA GCTAGACAAA AATGGCTTTG 960

ATAAGGTAGA ATTATACTAT ACCTTGGGAG AGATACTAGA GTCGAAGCGA TATGAGCGCA 1020

CCAGCCATTC TCAATGTGAT CGAGTTGGCC AAGAAATTCT ATCCACAGGG CGTTTCAGTC 1080

TTGCCGACGA CAGCGGGGAC AGGACCTATG CATACGGTCT TTGATGCCCT AGAGGTACCA 1140

ATGGTTGCAT TCGGTCTAGG AAATGCCAAT AGCCGAGACC ACGGTGGAGA TGAAAATGTG 1200

CGAATCGCTG ATTATTACAC CCATATCGAA TTAGTAGAGG AGCTGATTAG AAGCTATGAG 1260

TAGAGATATT ATCAAGTTAG ATCAGATCGA TGTGACTTTT CACCAAAAGA AGAGAACCAT 1320

CACAGCGGTT AAGGATGTGA CCATTCACAT CCAAGAAGGG GATATCTACG GAATCGTTGG 1380

ATATTCTGGA GCAGGGAAAT CAACCCTTGT ACGGGTGATT AACCTCTTGC AAAAACCATC 1440

TGCAGGGAAA ATTACCATTG ACGACGATGT GATTTTTGAC GGCAAGGTGA CCTTGACGGC 1500

AGAGCAGTTG CGTCGTAAAC GTCAAGATAT CGGGATGATT TTCCAGCATT TTAACCTGAT 1560

GAGCCAAAAG ACAGCAGAGG AGAATGTAGC CTTTGCCCTT AAACACTCTG GACTCAGCAA 1620

GGAAGAAAAG AAGGCTAAAG TAGCTAAGTT GTTGGACTTG GTTGGTTTGG CAGATCGTGC 1680

TGAAAACTAC CCTTCACAAC TATCTGGAGG GCAAAAACAG CGTGTGGCAA TTGCGCGTGC 1740

CTTGGCCAAT GATCCAAAAA TCTTGATTTC AGACGAGTCA ACTTCTGCCC TTGACCCTAA 1800

GACAACCAAG CAGATTTTGG CCTTGTTGCA AGATTTGAAC CAAAAATTAG GATTGACAGT 1860

TGTCTTGATT ACGCATGAAA TGAGATTGTC AAAGACATTG CCAACCGTGT GGCGGTTATG 1920

CAGGATGGGC ATTTGATTGA AGAGGGAAGT GTCCTTGAAA TCTTCTCAAA CCCTAAACAA 1980

CCTTTGACTC AAGACTTTAT CTCAACAGCC ACAGGTATTG ACGAAGCCAT GGTCAAAATC 2040

GAGAAGCAAG AAATCGTGGA ACACTTGTCT GAAAACAGTC TCTTGGTGCA ACTTCAAGTA 2100

CGCTGGAGCT TCAACAGACG AGCCACTTTT GAATGAATTG TACAAGCATT ACCAAGTAAT 2160

GGCTAATATT CTCTATGGGA ATATCGAAAT TCTCGATGGT ACTCCTGTTG GAGGAATTGG 2220

TGGTGGTCTT GTCAGGTGAA AAAGCAGCGT TGGCAGGTGC CCAAGAAGCC ATTCGTCAAG 2280

CAGGTGTACA ACTAAAAGTA TTGAAGGGAG TACAGTAAGA TGGAATCATT GATTCAAACC 2340

TATTTACCAA ATGTCTATAA AATGGGTTGG GCTGTCAGGC AGGCTGGGGG ACGGCTATCT 2400

ACTTAACTCT TTATATGCAG TTCTTTCCTT CATTATTCGG GTTCTTGGGG CTAGTGGCAG 2460

GTCTTCTCGT CTTAAGCGCC AGT 2483
(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1010 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

CCAATTAATG TGAGTTAGCT CACTCATTAG GCACCCCAGG CTTTACACTT TATGCTTCCG 60 

GCTCGTATGT TGTGTGGAAT TGTGAGCGGA TAACAATTTC ACACAGGAAA CAGCTATGAC 120 

CATGATTACG CCAAGCTTGC ATGCCTGCAG GTCGACTCTA GAGGATCCAA GCCATAGTTA 180 

GACATGACTG CCAAATCTAA GGTTTGAGCA GTTGTTAAAT AAGCATTAGC TGTCGCCTCC 240 

ATGTTGGGAC TGGTTACTTT GAGGCCTACT AAGGCTAGAG ATCCCAACAT CATCAGGATC 300 

AAGATGGATA AAAAACGCCC TTGGAGCCTG TGAAGGACTG AATTAAGTCC TTCCGAATAA 360 

GTTTTTCGCT TGATCATGCT AGTACTCCAA ACTGTCAATA TCCTGAGGAT GCTGGTTGAG 420 

CACCACATCC TTGACACTGG CATCGTGCAT TTGAATCACG CGATCAGCAA TGGGCGCCAA 480 

AGCTCCATTA TGAGTCACGA TGATCACCGT CGCTCCCTTT TGACGAGACA TGTCTTGGAG 540 

AATTTTCAAA ACCTGCTTGC CCGTCTGATA ATCCAAGGCT CCAGTCGGTT CATCACAAAG 600 

GAGAATTTTA GGATTTTTGG CTACCGCGCG TGCAATGGAG ACTCGCTGTT GCTCCCCTCC 660 

AGAAAGCTGG GCTGGAAAGT TATTTAGACG ATGAGCCAGA CCTACATCTG TCAAGACCTG 720 

ATCAGAATTC AAGGCATCTG TCACAATTTC AGAAGCAGTT CCACATTTTC CTTAGCTGTC 780 

AGATTAGAAA CTAGATTATA AAACTGAAAA ACAAACCCCA CATCATTTCT ACGGTAATTG 840 

GTGCGCTGGT GGGAACTATA ATCCGCAATA TTAACACCAT CAATCCAGAT TTCCCCTTCA 900 

TCATTGGTAT CCATTCCCCC AAGAAGGTTA AGAACTGTTG ACTTGCCTGC ACCTGAAGCA 960 

CCAAGGATAA TAACCAGTTC CCCCTTTTCA ATCTCAAAAT TCACATCACG 1010 
(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1299 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:

TCGATCGCAC CGTCCTCTCC TCGTTCTGCT CTTGCTGGGC TATAGTTCCC TCTTCTAGTC 60 

TTGATTTTCT TTGCCCATGC GTTCTTACCA CTTCTACTGT TTGCAGGTTT TACATGTCTG 120 

GATATACTAT TTGTGCTAGG CTTAGCTTCT AGGATGGAGA AAAGAAGTCT AGTAGAGTTA 180 

TTGAAAGGGG GCATCTTATG ATTGAGTTGA AAAATATTAC CAAAACCATT GGGGGAAAAG 240 

TGATTTTGGA TAACTTATCT CTCAGGATTG ATCAGGGGGA TTTGGTAGCT ATTGTTGGTA 300 

AGAGTGGTAG TGGGAAGTCG ACCTTGTTAA ATTTATTGGG TTTGATAGAT GGTGATTATA 360 

GCGGACGGTA TGAGATTTTT GGTCAGACAA ATCTAGCGGT TAATTCTGCT AAGTCGCAAA 420 

CAATAATCCG TGAACATATC TCTTATCTGT TTCAAAATTT TGCCCTGATT GATGATGAAA 480 

CGGTCGAGTA CAATCTCATG CTGGCGCTGA AATATGTGAA ATTGCCTAAG AAAGACAAGC 540 

TCAAAAAGGT GGAAGAGATT TTAGAGAGAG TAGGTTTGTC AGCTACTTTG CATCAAAGGG 600 

TCTCCGAGTT GTCTGGGGGC GAACAACAAC GAATTGCAGT TGCTAGAGCC ATCTTAAAAC 660 

CCAGCCAGCT GATTTTAGCC GATGAACCTA CAGGTTCGCT GGATCCTGAA AATAGAGATT 720 

TGGTCTTGAA GTTTCTCTTA GAGATGAATC GAGAAGGGAA AACAGTCATT ATTGTGACCC 780 

ACGATGCTTA TGTAGCCCAA CAATGTCATC GTGTCATTGA ATTGGGCGAG GGAAAATGAG 840 

TTCATTCAGC TCCTTTTGAC TGGCTGAATA CTCATGTTTT CCAGAGAAAA ATAGCATAAA 900 

TACGCCTAGG AATGACATTT TATGTAGCAT TTCTAGGTTT TTTTGTTTCA AATTGAAAAT 960 

TTTTTCAATT TAGGCTTGAC AAAGGATGAG TATAGGAGTA TTATTTATAC AATAAAAAAG 1020 

AATAAACATA AAGAAGGCTT TGTTATGAAT AAGATGAAGA AGGTGTTGAT GACGATGTTT 1080 

GGTTTAGTGA TGCTCCCCCT ACTATTTGCT TGTAGTAACA ATCAATCGGC TGGAATTGAA 1140 

GCCATCAAGT CCAAAGGAAA ATTGGTTGTA GCCCTCAATC CAGATTTTGC TCCATTTGAA 1200 

TATCAAAAAG TGGTTGATGG GAAAAATCAG ATTGTGGGTT CAGATATCGA CTTAGCCAAG 1260 

CTATCGCAAC AGAACTAGGT GTCGACTCTA GAGGATCCC 1299 
(2) INFORMATION FOR SEQ ID NO: 24: 
(i) SEQUENCE CHARACTERISTICS: 



(A) LENGTH: 252 base pairs 
(2) INFORMATION FOR SEQ ID NO:26:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 913 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:
AAACACTGCT TCTTGAGCGA ATGACGCTTT GTCCTTTTAA TGAGGTTACC AACGGCTTCA 60 

AAGAGGATTC CCAGCTCGTT CAGCTGTGGA GGTAGCTCGT CTTCCTCGTG ATGTAAAAGT 120 
CGAAATTGAA GTCATCGCAG AGATTGGATA AGCTAGTTGA AGTTTGGTGT TGCCAAACTT 180 
CTTTTGATAT AAGGAGAAAA AGATGACAAA GAAACAACTT CACTTGGTGA TTGTGACAGG 240
GATGGGTGGC GCAGGGAAAA CTGTAGCCAT TCAGTCCTTC GAGGATCTAG GTTATTTCAC 300 
CATTGATAAT ATGCCGCCAG CTCTCTTGCC TAAGTTTTTG CAGCTGGTTG AAATTAAGGA 360 

AGACAATCCT AAGTTGGCCT TGGTAGTGGA TATGCGTAGT CGTTCTTTCT TTTCAGAGAT 420 
TCAAGCTGTT TTGGATGAGT TGGAAAATCA AGATGGTTTG GATTTCAAAA TCCTCTTTTT 480 
GGATGCGGCT GATAAGGAAT TGGTCGCTCG TTACAAGGAA ACCAGACGGA GTCACCCACT 540
AGCAGCAGAC GGTCGTATTT TAGATGGAAT CAAGTTGGAA CGTGAACTCT TGGCACCTTT 600 
GAAAAATATG AGCCAAAATG TGGTGGATAC GACTGAACTC ACTCCACGTG AGCTGCGCAA 660 

AACCCTTGCA GAGCAGTTTT CAGACCAAGA ACAAGCTCAG TCTTTCCGTA TCGAAGTCAT 720 
GTCTTTCGGA TTTAAGTATG GAATCCCGAT TGATGCGGAC TTGGTCTTTG ATGTCCGTTT 780 
CTTGCCAAAT CCCTATTATT TACCAGAACT GAGAAACCAA ACGGGTGTGG ATGAACCTGT 840
TTATGATTAT GTCATGAACC ATCCTGAGTC AGAAGACTTT TATCAACATT TATTGGCCTT 900 
GATTGAGCCG ATT 913 
(2) INFORMATION FOR SEQ ID NO: 27: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5919 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

TCGATTCGTG GAGCAGGAAA TCTTTTAGGA AAATCCCAGT CTGGTTTCAT TGATTCTGTT 60 

GGTTTTGAAT TGTATTCGCA GTTATTAGAG GAAGCTATTG CTAAACGAAA CGGTAATGCT 120 

AACGCTAACA CAAGAACCAA AGGGAATGCT GAGTTGATTT TGCAAATTGA TGCCTATCTT 180 

CCTGATACTT ATATTTCTGA TCAACGACAT AAGATTGAAA TTTACAAGAA AATTCGTCAA 240

ATTGACAACC GTGTCAATTA TGAAGAGTTA CAAGAGGAGT TGATAGACCG TTTTGGAGAA 300 

TACCCAGATG TAGTAGCCTA TCTTTTAGAG ATTGGTTTGG TCAAATCATA CTTGGACAAG 360 

GTCTTTGTTC AACGTGTGGA AAGAAAAGAT AATAAAATTA CAATTCAATT TGAAAAAGTC 420 

ACTCAACGAC TGTTTTTAGC TCAAGATTAT TTTAAAGCTT TATCCGTAAC GAACTTAAAA 480 

GCAGGCATCG CTGAGAATAA GGGATTAATG GAGCTTGTAT TTGATGTCCA AAATAAGAAA 540

GATTATGAAA TTTTAGAAGG TCTGCTGATT TTTGGAGAAA GTTTATTAGA GATAAAAGAG 600 

TCTAAGGAAA AAAATTCCAT TTGATATTTT TCTTCTATAA AATAGATAAA ATGGTACAAT 660 

AATAAATTGA GGTAATAAGG ATGAGATTAG ATAAATATTT AAAAGTATCG CGAATTATCA 720 

AGCGTCGTAC AGTCGCAAAG GAAGTAGCAG ATAAAGGTAG AATCAAGGTT AATGGAATCT 780 

TGGCCAAAAG TTCAACGGAC TTGAAAGTTA ATGACCAAGT GAAATCGCTT GGCAATAAGT 840

TGCTGCTTGT AAAGGTACTA GAGATGAAAG ATAGTACAAA AAAAGAAGAT GCAGCAGGAA 900 

TGTATGAAAT TATCAGTGAA ACACGGGTAG AAGAAAATGT CTAAAAATAT TGTACAATTG 960 

AATAATTCTT TTATTCAAAA TGAATACCAA CGTCGTCGCT ACCTGATGAA AGAACGACAA 1020

AAACGGAATC GTTTTATGGG AGGGGTATTG ATTTTGATTA TGCTATTATT TATCTTGCCA 1080 

ACTTTTAATT TAGCGCAGAG TTATCAGCAA TTACTCCAAA GACGTCAGCA ATTAGCAGAC 1140

TTGCAAACTC AGTATCAAAC TTTGAGTGAT GAAAAGGATA AGGAGACAGC ATTTGCTACC 1200 

AAGTTGAAAG ATGAAGATTA TGCTGCTAAA TATACACGAG CGAAGTACTA TTATTCTAAG 1260

TCGAGGGAAA AAGTTTATAC GATTCCTGAC TTGCTTCAAA GGTGATAAAA TGGAAAATTT 1320 

ATTAGACGTA ATAGAGCAAT TTTTGAGTTT GTCAGATGAA AAGCTGGAAG AATTGGCTGA 1380 

TAAAAATCAA TTATTGCGTT TACAAGAAGA AAAGGAAAGG AAGAATGCGT AAATTCTTAA 1440

TTATTTTGTT GCTACCAAGT TTTTTGACCA TTTCAAAAGT CGTTAGCACA GAAAAAGAAG 1500 

TCGTCTATAC TTCGAAAGAA ATTTATTACC TTTCACAATC TGACTTTGGT ATTTATTTTA 1560 

GAGAAAAATT AAGTTCTCCC ATGGTTTATG GAGAGGTTCC TGTTTATGCG AATGAAGATT 1620 
TAGTAGTGGA ATCTGGGAAA TTGACTCCCA AAACAAGTTT TCAAATAACC GAGTGGCGCT 1680

TAAATAAACA AGGAATTCCA GTATTTAAGC TATCAAATCA TCAATTTATA GCTGCGGACA 1740

AACGATTTTT ATATGATCAA TCAGAGGTAA CTCCAACAAT AAAAAAAGTA TGGTTAGAAT 1800

CTGACTTTAA ACTGTACAAT AGTCCTTATG ATTTAAAAGA AGTGAAATCA TCCTTATCAG 1860

CTTATTCGCA AGTATCAATC GACAAGACCA TGTTTGTAGA AGGAAGAGAA TTTCTACATA 1920

TTGATCAGGC TGGATGGGTA GCTAAAGAAT CAACTTCTGA AGAAGATAAT CGGATGAGTA 1980

AAGTTCAAGA AATGTTATCT GAAAAATATC AGAAAGATTC TTTCTCTATT TATGTTAAGC 2040

AACTGACTAC TGGAAAAGAA GCTGGTATCA ATCAAGATGA AAAGATGTAT GCAGCCAGCG 2100

TTTTGAAACT CTCTTATCTC TATTATACGC AAGAAAAAAA TAAATGAGGG TCTTTATCAG 2160

TTAGATACGA CTGTAAAATA CGTATCTGCA GTCAATGATT TTCCAGGTTC TTATAAACCA 2220

GAGGGAAGTG GTAGTCTTCC TAAAAAAGAA GATAATAAAG AATATTCTTT AAAGGATTTA 2280

ATTACGAAAG TATCAAAAGA ATCTGATAAT GTAGCTCATA ATCTATTGGG ATATTACATT 2340

TCAAACCAAT CTGATGCCAC ATTCAAATCC AAGATGTCTG CCATTATGGG AGATGATTGG 2400

GATCCAAAAG AAAAATTGAT TTCTTCTAAG ATGGCCGGGA AGTTTATGGA AGCTATTTAT 2460

AATCAAAATG GATTTGTGCT AGAGTCTTTG ACTAAAACAG ATTTTGATAG TCAGCGAATT 2520

GCCAAAGGTG TTTCTGTTAA AGTAGCTCAT AAAATTGGAG ATGCGGATGG ATTTAAGCAT 2580

GATACGGGTG TTGTCTATGC AGATCCTCCA TTTATTCTTT CTATTTTCAC TAAGAATTCT 2640

GATTATGATA CGATTTCTAA GATAGCCAAG GATGTTTATG AGGTTCTAAA ATGAGGGAAC 2700

CAGATTTTTT AAATCATTTT CTCAAGAAGG GATATTTCAA AAAGCATGCT AAGGCGGTTC 2760

TAGCTCTTTC TGGTGGATTA GATTCCATGT TTCTATTTAA GGTATTGTCT ACTTATCAAA 2820

AAGAGTTAGA GATTGAATTG ATTCTAGCTC ATGTGAATCA TAAGCAGAGA ATTGAATCAG 2880

ATTGGGAAGA AAAGGAATTA AGGAAGTTGG CTGCTGAAGC AGAGCTTCCT ATTTATATCA 2940

GCAATTTTTC AGGAGAATTT TCAGAAGCGC GTGCACGAAA TTTTCGTTAT GATTTTTTTC 3000

AAGAGGTCCA TGAAAAAGAC AGGTGCGACA GCTTTAGTCA CTGCCCACCA TGCTGATGAT 3060

CAGGTGGAAA CGATTTTTAT GCGCTTGATT CGAGGAACCT CCTTGCGCTA TCTATCAGGA 3120

ATTAAGGAGA AGCAAGTAGT CGGAGAGATA GAAATCATTC GTCCCTTCTT GCATTTTCAG 3180

AAAAAAGACT TTCCATCAAT TTTTCACTTT GAAGATACAT CAAATCAGGA GAATCATTAT 3240

TTTCGAAATC GTATTCGAAA TTCTTACTTA CCAGAATTGG AAAAAGAAAA TCCTCGATTT 3300

AGGGATGCAA TCCTTAGGCA TTGGCAATGA AATTTTAGAT TATGATTTGG CAATAGCTGA 3360

ATTATCTAAC AATATTAATG TGGAAGATTT ACAGCAGTTA TTTTCTTACT CTGAGTCTAC 3420
ACAAAGAGTT TTACTTCAAA CTTATCTGAA TCGTTTTCCA GATTTGAATC TTACAAAAGC 3480

TCAGTTTGCT GAAGTTCAGC AGATTTTAAA ATTTAAAAGC CAGTATCGTC ATCCGATTAA 3540

AAATGGCTAT GAATTGATAA AAGAGTACCA ACAGTTTCAG ATTTGTAAAA TCAGTCCGCA 3600

GGCTGATGAA AAGGAAGATG AACTTGTGTT ACACTATCAA AATCAGGTAG CTTATCAAGG 3660

ATATTTATTT TCTTTTGGAC TTCCATTAGA AGGTGAATTA ATTCAACAAA TACCTGTTTC 3720

ACGTGAAACA TCCATACACA TTCGTCATCG AAAAACAGGA GATGTTTTGA TTAAAAATGG 3780

GCATAGAAAA AAACTCAGAC GTTTATTTAT TGATTTGAAA ATCCCTATGG AAAAGAGAAA 3840

CTCTGCTCTT ATTATTGAGC AATTTGGTGA AATTGTCTCA ATTTTGGGAA TTGCGACCAA 3900

TAATTTGAGT AAAAAAACGA AAAATGATAT AATGAACACT GTACTTTATA TAGAAAAAAT 3960

AGATAGGTAA AAAATGTTAG AAAACGATAT TAAAAAAGTC CTCGTTTCAC ACGATGAAAT 4020

TACAGAAGCA GCTAAAAAAC TAGGTGCTCA ATTAACTAAA GACTATGCAG GAAAAAATCC 4080

AATCTTAGTT GGGATTTTAA AAGGATCTAT TCCTTTTATG GCTGAATTGG TCAAACATAT 4140

TGATACACAT ATTGAAATGG ACTTCATGAT GGTTTCTAGC TACCATGGTG GAACAGCAAG 4200

TAGTGGTGTT ATCAATATTA AACAAGATGT GACTCAAGAT ATCAAAGGAA GACATGTTCT 4260

ATTTGTAGAA GATATCATTG ATACAGGTCA AACTTTGAAG AATTTGCGAG ATATGTTTAA 4320

AGAAAGAGAA GCAGCTTCTG TTAAAATTGC AACCTTGTTG GATAAACCAG AAGGACGTGT 4380

TGTAGAAATT GAGGCAGACT ATACCTGCTT TACTATCCCA AATGAGTTTG TAGTAGGTTA 4440

TGGTTTAGAC TACAAAGAAA ATTATCGTAA TCTTCCTTAT ATTGGAGTAT TGAAAGAGGA 4500

AGTGTATTCA AATTAGAAAG AATAATCTTT AATGAAAAAA CAAAATAATG GTTTAATTAA 4560

AAATCCTTTT CTATGGTTAT TATTTATCTT TTTCCTTGTG ACAGGATTCC AGTATTTCCT 4620

ATTCTGGGAA TAACTCAGGA GGAAGTCAGC AAATCAACTA TACTGAGTTG GTACAAGAAA 4680

TTACCGATGG TAATGTAAAA GAATTAACTT ACCAACCAAA TGGTAGTGTT TCGAAGTTTC 4740

TGGTGTCTAT AAAAATCCTA AAACAAGTAA AGAAGGAACA GGTATTCAGT TTTTCACGCC 4800

ATCTGTTACT AAGGTAGAGA AATTTACCAG CACTATTCTT CCTGCAGATA CTACCGTATC 4860

AGAATTGCAA AAACTTGCTA CTGACCATAA AGCAGAAGTA ACTGTTAAGC ATGAA^GTTC 4920

AAGTGGTATA TGGATTAATC TACTCGTATC CATTGTGCCA TTTGGAATTC TATTCTTCTT 4980

CCTATTCTCT ATGATGGGAA ATATGGGAGG AGGCAATGGC CGTAATCCAA TGAGTTTTGG 5040

ACGTAGTAAG GCTAAAGCAG CAAATAAAGA AGATATTAAA GTAAGATTTT CAGATGTTGC 5100

TGGAGCTGAG GAAGAAAAAC AAGAACTAGT TGAAGTTGTT GAGTTCTTAA AAGATCCAAA 5160

ACGATTCACA AAACTTGGAG CCCGTATTCC AGCAGGTGTT CTTTTGGAGG GACCTCCGGG 5220

GACAGGTAAG ACTTTGCTTG CTAAGGCAGT CGCTGGAGAA GCAGGTGTTC CATTCTTTAG 5280
TATCTCAGGT TCTGACTTTG TAGAAATGTT TGTCGGAGTT GGAGCTAGTC GTGTTCGCTC 5340
TCTTTTTGAG GATGCCAAAA AAGCAGCACC AGCTATCATC TTTATCGACT GAAATGGATG 5400 

CCCGTGGGAC GTCAACGTGG AGTCGGTCTC GGCGGAGGTA ATGACGAACG TGAACAAACC 5460 
TTGAACCAAC TTTTGATTGA GATGGATGGT TTTGAGGGAA ATGAAGGGAT TATCGTCATC 5520 
GCTGCGACAA ACCGTTCAGA TGTACTTGAT CCTGCCCTTT TGCGTCCAGG ACGTTTTGAT 5580
AGAAAAGTAT TGGTTGGCCG TCCTGATGTT AAAGGTCGTG AAGCAATCTT GAAAGTTCAC 5640 
GCTAAGAACA AGCCTTTAGC AGAAGATGTT GATTTGAAAT TAGTGGCTCA ACAAACTCCA 5700 

GGCTTTGTTG GTGCTGATTT AGAGAATGTC TTGAATGAAG CAGCTTTAGT TGCTGCTCGT 5760 
CGCAATAAAT CGATAATTGA TGCTTCAGAT ATGATGAAAG CAGAAGATAG AGTTATTGCT 5820 
GGACCTTCTA AGAAAGATAA GACAGTTTCA CAAAAAGAAC GAGAATTGGT TGCTTACCAT 5880
GAGGCAGGAC ATACCATTGT TGGTCTAGTC TTGTCGACT 5919 
(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1863 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:
GAGCTCGGTA CCCGGGGATC ATACTCAAGA GGAGGTAATC CAATGAACAC TAGTCTTAAA 60 
CTCAGCAAAC AACTCAGTTT TGGAGAGGAG ATTGCTAATA GCGTGACCCA TGCTGTGGGT 120
GCAGTCATCA TGCTTATCTT GCTGCCTATT TCATCCATCT ATAGTTATGA AGCACACGGA 180 
TTTTTATCCT CTATCGGCGT TTCCATTTTC GTCATCAGTC TCTTTCTCAT GTTCCTATCA 240 

TCCACCATTT ATCACTCTAT GGCCTATGGT TCGACCCACA AATATGTTTT GCGAATCATT 300 
GACCATTCTA TGATTTACGT TGCCATTGCC GGCTCATACA CGCCCGTTGT CTTGACCTTG 360 
ATGAATAACT GGTTTGGCTA TCTGATTATT GTCATCCAAT GGGGAACGAC CATCTTTGGT 420
ATTCTCTATA AAATCTTTGC TAAAAAGGTC AATGAGAAAT TTAGCCTTGC TCTTTACCTG 480 
ATTATGGGCT GGTTGGTTCT GGCTATCATT CCTGCCATTA TCAGTCAAAC GACACCCGTT 540 
TTCTGGAGTC TCATGGTAAC TGGCGGACTC TGTTATACAG TTGGAGCTGG ATTTTATGCC 600 



AAGAAAAAAC CTTATTTCCA CATGATTTGG CATCTCTTTA TCCTAGCTGC GTCCGCACTC 660 

CAATACATCG CTATTGTTTA TTACATGTAA AAAAGTTGAG AAATTCAATC TCAACTTTTT 720 

TCTTTACACA TATTGATAAA GTACTGGTGC AAGCGCACAT CATCAGTCAA TTCTGGATGA 780 

AAAGAACTTA CCAACATATT TTTTTCTTGG GCTGCAACAA TTTGATTGTT CACTATTGCT 840

AAAATTTCTA CACCCTCACC AACACTACTG ATAATCGGAC CACGGATAAA GGTCATTGGA 900

ATCTTGCCAA CTCCCTTACA TTCTGCTTCC GTGTAGAAAC TTCCTAATTG GCGCCCATAA 960 

GCATTACGCT CGACCACCAT ATCCATAGTT CCTAGATGAC TCTCTTTCTG AGAAGTGATT 1020 

TCCTTAGCCA GCAAAATTAA GCCCGCACAG GTCCCAAACA CTGGTAAGCC AGATAGAATG 1080 

GCTTCTCGTA TGGGAAGTAG CATGTTCTGG TCACGTAAGA GCTTGCCCAT GGTTGTAGAC 1140 

TCACCACCAG GCAAAATAAA CCCGACAAGT CACTCTGATC TTGCTGAAAA CATCTAGATT 1200

TCTGAGTTCT ACACTCTCGA CACCTAATTG ATCTAGCACT TTTGCATGTT CTGCAAAGGC 1260 

CCCTTGCAAG GCCAATATTC CGATTTTCAT CTATTTTCCT CGTTCAGCCA TGAGAATTTG 1320 

GATTCATTTT CATTAATACC AACCATGGCT TCTCCTAAAT CTTCAGAGAT TTGAGCTAGG 1380 

ATTTGAGGAT TACGGAAGTT AGTCACAGCC TTAACAATGG CACTCGCTCG TTTAACAGGA 1440 

TCTCCTGACT TGAAAATACC TGAACCGACA AAGACCCCCT CTGCCCCTAA TTGCATCATT 1500

AACGCAGCAT CTGCTGGCGT TGCAACACCT CCAGCAGCGA AATTTACAAC TGGCAATTTT 1560 

CCATGTTCAT GAACATATTG GACCAATTCT ACAGGGACTT GCAAATCCTT GGCAGCAACA 1620 

TAAAGCTCGT CCTCACGTAA GTTTTGAATG CGGCGAATTT CCTGATTCAT CATACGCATA 1680 

TGACGAACAG CTTGGACTAT ATCCCCTGTC CCTGGTTCTC CTTTAGTACG AATCATGGAA 1740 

GCACCTTCAG CGATACGACG CAAGGCTTCA CCCAAATCCT TAGCACCACA GACAAAAGGA 1800

ACTTGGAATT CTTTCTTGTC CACATGGAAA CGGTCATCAG CTGGAGATAG AACTTCACTC 1860 

TCG 1863 
(2) INFORMATION FOR SEQ ID NO:29: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5448 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:

TAAAGAAGGT GGATTTGAAG TTAACGGTAA ATTCATCAAA GTTTCTGCTG AACGTGATCC 60 

AGAACAAATC GACTGGGCTA CTGACGGTGT AGAAATCGTT CTTGAAGCTA CTGGTTTCTT 120 

TGCTAAGAAA GAAGCAGCTG AAAAACACCT TAAAGGTGGA GCTAAAAAAG TTGTTATCAC 180 

TGCTCCTGGT GGAAACGACG TTAAAACAGT TGTATTCAAC ACTAACCACG ACGTTCTTGA 240

CGGTACTGAA ACAGTTATCT CAGGTGCTTC ATGTACTACA AACTGCTTGG CTCCAATGGC 300 

TAAAGCTCTT CAAGACAACT TTGGTGTTGT TGAAGGATTG ATGACTACTA TCCACGCTTA 360 

CACTGGTGAC CAAATGATCC TTGACGGACC ACACCGTGTG GTGACCTTCG CCGTGCTCGC 420 

GCTGGTGCTG CAAACATCGT TCCTAACTCA ACTGGTGCTG CAAAAGCTAT CGGTCTTGTA 480 

ATCCCAGAAT TGAATGGTAA ACTTGATGGA TCTGCACAAC GCGTTCCAAC TCCAACTGGA 540

TCAGTTACTG AATTGGTCGC AGTTCTTGAA AAGAACGTTA CTGTTGATGA AGTGAACGCA 600 

GCTATGAAAG CAGCTTCAAA CGAATCATAC GGTTACACAG AAGATCCAAT CGTATCTTCA 660 

GATATCGTAG GTATGTCTTA CGGTTCATTG TTTGACGCAA CTCAAACTAA AGTTCTTGAC 720 

GTTGACGGTA AACAATTGGT TAAAGTTGTA TCATGGTACG ACAACGAAAT GTCATACACT 780 

GCACAACTTG TTCGTACTCT TGAATACTTC GCAAAGATTG CTAAATAATT CTTGAGTTGA 840

TAGAAAGCAA GGCTTTGTGG TCTTGCTTTT TTATATGGAA AAATGGATGA CACGATCATC 900 

CATTCTTTTT TAATTCTTTT TCAAATGTAT TTGAAAGGGT AGTGAAAGTT AGCCTCTCTA 960 

AAGTAAGTGG GTGGGTAAAG GAAAGTCGGA AGGCATGAAG CATAAGCCGG CTTGTCTTTG 1020 

ATTTACTATT ATAGAGAGGG TCTCCCAGGA TAGGAAGATT ATGATGCAAA AGGTGCACAC 1080 

GAATCTGATG GGTTCGCCCT GTCTTTAGCT TGCAATGAGC CAAGGAAGTC TTGTTTGAGA 1140

ATTGCTTTAA TCTGCTTACA TGCGTTTCAG CATATTTCCC ATTTTTTGCA TCAACTATTC 1200 

TTTTTCTACG ATCATGGCGA TCACGTCCAA TTTTGTCTCT GAAAACAAGT TCTTTTCTGT 1260 

TGATATTTCC ATCAACTAGA GCCCAATATT CTCTAGAAAT CTCTTTTTTC TCCAATAAGC 1320 

GATTGAGAAT GGGCAGGATA AAAGGATTTT TGGCAAAGAG AACTAAGCCA CTGGTTTCCA 1380 

TGTCCAGACG ATGAACGACA TAGCAGGTTT GGCCAACATA GGTACTGACA TGGTTAAGAA 1440

GGGCAATTTC GTTTGGTTGA TTACCATGCG TTTTCATCCC CTCTGGTTTG TTTACAATAA 1500 

TCAAGTGTTG ATCTTGATAA ACTTCCTGCA CTAAGTCTGG GTTGCCCCAA GGGATCGTCT 1560

TTTGGGGATA ATCTTCCTCG TCAAAAGTCA ACTGGCAAAC ATCTCCAGGA TTTACGATTT 1620 

CGTTCCAGCG GACTTCTTCT TGATTTATCA AAATATGTTT CTTGATTCTC AAAAAATGAC 1680 

GGATTTTTCT AGGGATGAGG AGTTGTTCCT CAAGTAATTG CTTTACCGTC ATTTGAGGTA 1740
GAGAGGCGGG TAATGTAAAT GTGAATTGCA TACAGATATT GTAACAAAAA AAGCCCTATT 1800

TGGATAGGAA ATAGCTAAAT TCTTGTCTTC CTATGATGAA GATGATAAAA TAAACGCATG 1860

AAATTAGATA AATTATTTGA GAAATTTCTT TCTCTTTTTA AAAAAGAAAC AAGTGAACTA 1920

GAGGACTCTG ATTCTACTAT CTTACGTCGC TCTCGTAGTG ATCGAAAAAA ATTAGCCCAA 1980

GTAGGTCCGA TTCGAAAATT CTGGCGTCGT TATCATCTAA CAAAGATTAT CCTTATACTA 2040

GGTTTGAGTG CAGGCTTGCT AGTTGGAATC TATTTGTTTG CTGTAGCCAA GTCGACCAAT 2100

GTCAATGATT TGCAAAATGC CTTGAAAACT CGGACTCTTA TTTTTGACCG TGAAGAAAAA 2160

GAGGCTGGTG CCTTGTCTGG TCAAAAGGGA ACCTATGTTG AGCTGACTGA CATCAGTAAA 2220

AACTTGCAGA ATGCTGTTAT TGCGACAGAA GACCGTTCTT TCTATAAAAA TGACGGGATT 2280

AACTATGGCC GTTTCTTCTT GGCTATTGTC ACTGCTGGAC GTTCAGGTGG TGGCTCTACC 2340

ATTACCCAAC AGCTGGCTAA AAACGCCTAT TTATCGCAGG ATCAAACTGT TGAGAGAAAA 2400

GCGAAAGAAT TTTTCCTTGC CTTAGAATTA AGCAAAAAAT ATAGTAAGGA GCAAATTCTA 2460

ACCATGTACC TTAACAACGC TTATTTTGGA AATGGTGTGT GGGGTGTAGA AGATGCGAGT 2520

AAGAAATACT TTGGAGTTTC TGCATCAGAA GTGAGTCTGG ATCAAGCTGC GACTCTGGCA 2580

GGGATGCTCA AGGGGCCGGA ACTGTATAAT CCCTTGAATT CCGTAGAAGA TTCTACTAAT 2640

CGGCGCGATA CTGTCTTGCA GAATATGGTT GCAGCAGGAT ATATTGATAA AAACCAAGAA 2700

ACCGAAGCTG CTGAAGTTGA TATGACTTCG CAATTGCACG ATAAGTATGA AGGAAAAATC 2760

TCAGATTACC GTTACCCCTC TTATTTTGAT GCGGTGGTTA ATGAAGCTGT TTCCAAGTAT 2820

AATCTAACAG AGGAAGAGAT TGTCAATAAT GGCTACCGCA TTTACACAGA GCTGGACCAA 2880

AACTACCAAG CAAATATGCA GATTGTTTAT GAAAACACAT CGCTATTTCC GAGGGCAGAG 2940

GATGGAACGT TTGCTCAATC AGGAAGTGTA GCTCTCGAAC CGAAAACAGG GGGAGTTCGT 3000

GGAGTTGTCG GTCAAGTTGC TGACAATGAT AAAACTGGAT TCCGGAATTT CAACTATGCA 3060

ACCCAATCAA AGCGTAGTCC TGGTTCTACA ATTAAGCCTT TAGTTGTTTA TACACCAGCA 3120

GTTGAAGCAG GCTGGGCTTT GAATAAGCAG TTGGATAACC ATACCATGCA GTATGATAGC 3180

TATAAGGTTG ATAACTATGC AGGGATCAAA ACAACTCGAG AAGTTCCTAT GTATCAATCC 3240

TTGGCAGAAT CGCTTAATCT ACCTGCTGTT GCCACTGTTA ATGATTTGGG TGTTGACAAG 3300

GCTTTTGAGG CAGGCGAAAA ATTCGGACTC AACATGGAAA AGGTCGACCG TGTTCTTGGT 3360

GTCGCCTTGG GAAGCGGTGT TGAAACCAAC CCTCTTCAAA TGGCTCAAGC ATACGCTGCC 3420

TTTGCAAATG AAGGTTTAAT GCCTGAAGCT CATTTTATTA GTAGAATTGA AAATGCTAGT 3480

GGACAAGTTA TTGCGAGTCA TAAAAATTCA CAAAAACGGG TGATTGATAA GTCTGTAGCT 3540

GACAAGATGA CCAGTATGAT GTTGGGGACT TTCACCAACG GTACCGGTAT TAGTTCATCG 3600
CCTGCAGACT ATGTCATGGC AGGGAAAACT GGAACAACTG AAGCAGTTTT CAATCCGGAG 3660

TACACAAGTG ACCAGTGGGT AATTGGTTAT ACTCCGGATG TAGTGATTAG CCACTGGCTT 3720

GGCTTTCCGA CCACTGATGA AAATCACTAT CTAGCTGGCT CTACTTCAAA CGGTGCAGCT 3780

CATGTCTTTA GAAACATTGC CAATACTATT TTACCTTATA CGCCAGGAAG TACCTTTACG 3840

GTTGAAAATG CTTATAAGCA AAATGGAATT GCACCAGCCA ATACAAAAAG ACAAGTACAA 3900

ACCAATGATA ATAGCCAGAC AGATGATAAT TTGTCTGATA TTCGAGGGCG TGGGCAAAGT 3960

CTAGTAGATG AGGCTAGCCG GGCTATCTCA GATGCGAAGA TTAAGGAAAA GGCTCAAACA 4020

ATATGGGATT CGATAGTCAA TCTATTTCGC TAAGATGCTT GTCAAAGCCT AGCTTTCTTG 4080

TTATAATGGA TAAGATGGAG GCGTTATGGC ACTAAAAAAA GCAAGCCTAG CTTGTGCGGT 4140

TTGTGGTTCG AGAAACTATT CAATCAAGAT CAGCGGAAAC CCCAAGCCTA CACGACTAGA 4200

AGTAAATAAA TTTTGTAAGC ATTGTGGCAA GTACACTACA CACAGAGAAA CGAGATAGGA 4260

GAGAGCGATG CGTTTTATTG GAGATATTTT TAGACTTCTT AAAGACACAA CATGGCCAAC 4320

TCGCAAGGAA AGCTGGAGAG ATTTTCGTTC TATCATGGAA TACACAGCTT TCTTTGTAGT 4380

AATTATTTAC ATTTTTGACC AGTTGATTGT TTCAGGTTTG ATTCGATTTA TTAACATTTT 4440

TTAGAAGATT AGTGGAGTTA ATTACACTAG AAATCTTCTA TTTATGAAAG GAAATATCAT 4500

GGATAGTTTT GATAAAGGAT GGTTTGTTTT ACAAACTTAT TCTGGTTATG AAAATAAGGT 4560

GAAAGAAAAT CTATTACAAC GTGCACAAAC CTACAATATG TTGGATAATA TTCTACGCGT 4620

TGAAATTCCA ACACAAACAG TGCAAGTTGA AAAAAATGGA AAGAGAAAAG AAGTAGAAGA 4680

AAATCGCTTT CCAGGTTATG TTCTTGTAGA AATGGTCATG ACAGATGAAG CTTGGTTTGT 4740

TGTTCGAAAC GCACAGAGTC CTACAAAATT CATTTCAGAA CAAACAGCTT ATGAAATTGA 4800

TGAAGAGGTT CGTTCATTAT TAAATGAGGC ACGAAATAAA GCTGCTGAAA TTATTCAGTC 4860

AAATCGTGAA ACTCACAAGT TAATTGCAGA AGCATTATTG AAATACGAAA CATTGGATAG 4920

TACACAAATT AAAGCTCTTT ACGAAACAGG AAAGATGCCT GAAAGCAGTA GAAGAGGAAT 4980

CTCATGCACT ATCCTATGAT GAAGTAAAGT CAAAAATGAA TGACGAAAAA TAACCCTGAG 5040

AGAGGCTGGA GCCTCTCTTT TTTGTGCAGT TTAGGAGCTA AAGGGAACAG AATGGAGAAA 5100

ATGGAACAAA TGTGTTTTCT AATCTGTTAG ACTGTATCTA GAAAGGGGAA AATTATGATT 5160

AAAGAATTGT ATGAAGAAGT CCAAGGGACT GTGTATAAGT GTAGAAATGA ATATTACCTT 5220

CATTTATGGG AATTGTCGGA TTGGGACCAA GAAGGCATGC TCTGCTTACA TGAATTGATT 5280

AGTAGAGAAG AAGGACTGGT AGACGATATT CCACGTTTAA GGAAATATTT CAAAACCAAG 5340

TTTCGAAATC GAATTTTAGA CTATATCCGT AAGCAGGAAA GTCAGAAGCG TAGATACGAT 5400

AAAGAACCCT ATGAAGAAGT GGGTGAGATC CCCGGTACCG AGCTCGAA 5448
(2) INFORMATION FOR SEQ ID NO:30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1040 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

TAGTGAATTC GAGCTCGGTA CCCGGGGATC GTTTCTCGGT TCTTTTGGAG CACAAGGGCA 60 

TCCATCCCAT TGTCTATATT TCCAAAATGG ATTTGTTGGA AGATAGGGGA GAACTGGATT 120 

TTTACCAGCA GACCTATGGT GACATCGGCT ATGACTTTGT GACCAGTAAA GAGGAACTCC 180 

TGTCTTTGTT AACAGGCAAG GTTACGGTCT TTATGGGGCA GACAGGTGTT GGGAAGTCAA 240 

CTCTTCTCAA TAAAATCGCA CCAGACCTCA ATCTTGAAAC GGGAGAAATT TCAGACAGTC 300 

TAGGTCGCGG TCGCCATACC ACTCGAGCTG TTAGTTTTTA CAATCTCAAC GGGGGTAAAA 360 

TCGCAGATAC ACCAGGATTT TCATCCTTGG ACTATGAAGT ATCAAGGGCT GAAGACCTCA 420 

ATCAGGCTTT CCCAGAGATT GCTACTGTTA GCCGAGATTG TAAGTTCCGT ACTTGTACCC 480 

ATACCCATGA GCCGTCTTGT GCCGTCAAAC CAGCTGTTGA AGAGGGTGTT ATTGCAACCT 540 

TCCGTTTTGA CAATTACCTG CAATTCCTTA GTGAAATTGA AAATCGTAGA GAAACCTATA 600 

AAAAAGTCAG CAAAAAAATT CCAAAATAAG GAGAAACCTA TGTCTCAATA CAAGATTGCT 660 

CCGTCAATTC TGGCAGCAGA TTATGCCAAC TTTGAACGTG AAATCAAACG TCTAGAAGCA 720 

ACTGGGGCAG AATATGCCCA TATCGATTCT GGACAGTCAT TTTGTACCGC AAATCAGTTT 780 

TGGTGCAGGT GTGGTCGAGA GCTTCGTCCT CATAGTAAGA TGGTTTTCGA TTGCCACTTG 840 

ATGGTGTCAA ACCCTGAGCA TCATCTGGAA GATTTTGCGC GTGCAGGTGC AGACATCATC 900 

AGTATCCATG TAGAAGCAAC ACCTCATATT CATGGCGCCC TCCAAAAAAT TCGTTCACTC 960 

GGAGTTAAGC CTTCAGTCGT TATCAATCCT GGCACACCAG TTGAAGCCAT CAAGCACGTC 1020 

CTTCATCTAG TGACAAGTTT 1040 

(2) INFORMATION FOR SEQ ID NO:31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 789 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 
ATATCACGAC GGAGCCATAC TACCGATTTT CTTAAGCATA GCGCCACCTT TACCGATGAT 60
AATCCCTTTT TGGCTATCGC GCTCGACCAT GATGGTTGCA CGGATGTGAA CCTTGTCTGT 120
CTCTTCGTCT CGTTTCATAG AGTCAACAAC TACTGCTACA GAATGCGGAA TCTCTTCACG 180
AGTTAGGTGC AAGACTTTCT CGCGAACCAT TTCTGAAACT AAGAAACGTT CTGGATGATC 240
TGTGATTTGA TCAGACGGGA AATATTGGAA ACCTTCATCC AGATTTTCAC TCAAAATATC 300
CACTAGACGA GACACGTTAT TTCCCTGAAG GGCTGAGATT GGAACAATTT CCTTAAAGTC 360
CATTTGATTA CGGAAGTCAT CAATCTGAGA CAAGAGCTGG TCTGGATGGA CCTTATCGAT 420
TTTATTCACC ACCAAAATCA CAGGAACCTT GGCAGCCTTG GAGACGCTCG ATAATCATAT 480
CGTCCCCCTT ACCACGCGCT TCATCAGCAG GCACCATGAA AAGAACAGTG TCCACTTCGC 540
GAAGGTACTG TAGGCAGACT CAACCATGAA ATCTCCGAGA GCTGTTTTAA GTTTGTGAAT 600
CCCTGGTGTG TCGATAAAGA CAATTTGCTC CTTATCAGTC GTGTAATTCC CATGATTTTA 660
TTGCGCGTTG TCTGCGCCTT GTCACTCATG ATGGCAATCT TTTGCCCCAT AACGTGATTT 720
AAAAAGGTTG ACTTCCCAAC ATTGGGACTC CTAAAATGGC TACAAACCTG ATTTAAAATT 780
CATAATTCC 789
(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1233 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

TGAGATAAAC TTGCGACTCA TATGAGAATA TGAATCAAGC CGTCCTCGTG AACCCCGATA 60

TCAACGAAGG CACCGAAGTC AACAACGTTA CGCACAACAC CTTCTAGCTT CTGACCTACC 120

ACTAGGTCTT TGATATCTAG GACATCTTGG CGAAGCACAG GTGCGTCAAA GGAATCACGG 180

AAATCTCGAC CTGGTTTGAG AAGATCTGCA ATGATATCTT TAAGGGTTTC TGGACCGAGG 240

TCTAGCTCTT GAGCCATTTC CTTGACTGAA AGGGACTTGA GTTTGCTTGG GCTTCTTCGT 300

TTAGGTCTTT AATATCTAAA CGTTTGAAGA GTCCTTAACT GCAGTGTAAT TCTCTGGGTG 360

AACTCCTGTA TTATCAAGGA TATTGCTACT TTCAGGGATA CGAAGGAAAC CAGCAGCCTG 420

CTCAAAGGCC TTGGCTCCCA GACGAGGAAC TTTCTTGATT TGGGCGCGTG AAGTGATTTT 480

TCCTTCTTCC TCGCGGTATT TGACAATATT TTCAGAGATA GTTTTGTTGA GTCCAGCTAC 540

GTGTGAAAGA AGAGCTGGGC TAGCTGTATT GACATTGACA CCAACTTGGT TAACCACTGT 600

ATCGACAACA AAGTCCAGAC TCTCAGATAG TTTCTTCTGA CTGACATCGT GTTGGTATTG 660

ACCGACACCA ATTGACTTAG GATCGATTTT GACCAATTCC GCAAGAGGAT CTTGCAAACG 720

ACGGGCGATA GAAATGGCAG AGCGTTTTTC AACGGTCAAG TCTGGAAACT CCTGACGAGC 780

AAGTTCGCTG GCAGAATAGA CAGAAGCACC ACTTTCATTA ACGATAACAT AGCTGACTTC 840

AGGGAAATCT TTCAGAACTT CCGCTACAAA AGCTTCACTT TCACGACTGG CCGTTCCATT 900

TCCAATGGCA ATAATCTCTA CACCGTATTG ACCAATTAAA TCTGCTAAAT CTTTCTTGGC 960

TTCTTCGATT TGACGAGCTG ATGCTGGTTT AACAGGATAA ATAACCTGAG TTGTCAGCAT 1020

TTTTCCTGTT GCATCCACGA CAGCTAACTT GGCACCTGTA CGAAAGGCTG GGTCAAATCC 1080

AAGAACCACG CGCCCTTTCA GTGGAGCAAC CAAGAGGAGA TTGCGCAGAT TGTCAGAAAA 1140

AAGTTGGATA GCTCCCTCTT CAGCTTTCTC AGTTAATTCT GTCCGAATAC GACGCTCAAT 1200

AGCAGGCAAG ACCTTTTTCT TAACGGATTG CTG 1233


(2) INFORMATION FOR SEQ ID NO: 33: 











(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6679 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 
ACAAGGCGTC ATCCGTGTAT TTCTTAAATA CGGACCAAAT GGTGAGAAAG TTATCACTAA 60

CTTGAAACGT GTTTCTAAAC CAGGACTTCG TGTCTACAAA AAACGTGAAG ACCTTCCAAA 120

AGTTCTTAAC GGACTTGGAA TTGCCATCCT TTCAACTTCT GAAGGTTTGC TTACTGATAA 180

AGAAGCACGC CAAAAGAATG TTGGTGGTGA GGTTATCGCT TACGTTTGGT AAAATCAAGA 240

TACAAAGCTC GTAAAGAACA AAGCAAAATT AGGAAGTTGG AGAAGTTTGT TTACAAACAA 300

GCCAACTTAT CTATTTTGCA CAGTTCTTAG ATCGTGTTCA GTTCAGCTCT TGAACTAAAT 360

AAGTATCTGA ACCCCGTGAA AACTGGCCGT TCTGGCTGAC AATTTAACAG GAGAAAATAA 420

ACATGTCACG TATTGGTAAT AAAGTTATCG TGTTGCCTGC TGGTGTTGAA CTCGCTAACA 480

ATGACAACGT TGTAACTGTA AAAGGACCTA AAGGAGAACT TACTCGTGAG TTCTCAAAAG 540

ATATTGAAAT CCGTGTGGAA GGTACTGAAG TAACTCTTCA CCGTCCAAAC GATTCAAAAG 600

AAATGAAAAC TATCCACGGA ACTACTCGTG CCCTTTTGAA CAACATGGTT GTTGGTGTAT 660

CAGAAGGATT CAAGAAAGAA CTTGAAATGC GTGGGGTTGG TTACCGTGCA CAGCTTCAAG 720

GATCTAAACT TGTTTTGGCT GTTGGTAAAT CTCATCCAGA CGAAGTTGAA GCTCCAGAAG 780

GAATTACTTT TGAACTTCCA AACCCAACAA CAATCGTTGT TAGCGGAATT TCAAAAGAAG 840

TAGTTGGTCA AACAGCTGCT TACGTACGTA GCCTTCGTTC ACCAGAACCA TATAAAGGTA 900

AAGGTATCCG TTACGTTGGT GAATTCGTTC GTCGTAAAGA AGGTAAAACA GGTAAATAAT 960

GTTGAGTGGT TGATCATCAA CCACCAACCT ATTTTCCAAC TTTGTGCATA GCAACGATTT 1020

AAAACTAAAG AGGTGAAAAC TGTGATTTCA AAACCAGATA AAAACAAACT CCGCCAAAAA 1080

CGCCACCGTC GCGTTCGGGA AAACTCTCTG GAACTGCTGA TCGCCCACGT TTGAACGTAT 1140

TCCGTTCTAA TACAGGCATC TACGCTCAAG TGATTGATGA CGTAGCGGGT GTAACGCTCG 1200

CAAGTGCTTC AACTCTTGAT AAAGAAGTTT CAAAAGGAAC TAAAACTGAA CAAGCCGTTG 1260

CTGTCGGTAA ACTCGTTGCA GAACGTGCAA ACGCTAAAGG TATTTCAGAA GTGGTGTTCG 1320

ACCGCGGTGG ATATCTATAT CACGGACGTG TGAAAGCTTT GGCTGATGCA GCTCGTGAAA 1380

ACGGATTGAA ATTCTAATAG GAGGACACTA GAAAATGGCA TTTAAAGACA ATGCAGTTGA 1440

ATTAGAAGAA CGCGTAGTTG CTGTCAACCG TGTTACAAAA GTTGTTAAAG GTGGACGTCG 1500

TCTTCGTTTC GCAGCTCTTG TTGTTGTTGG TGACCACAAT GGTCGCGTAG GATTTGGTAC 1560

TGGTAAAGCT CAAGAAGTTC CAGAAGCAAT CCGTAAAGCA GTAGATGATG CTAAGAAAAA 1620

CTTGATCGAA GTTCCTATGG TTGGAACAAC AATCCCACAC GAAGTTCTTT CAGAATTCGG 1680

TGGAGCTAAA GTATTGTTGA AACCTGCTGT AGAAGGTTCT GGAGTTGCCG CTGGTGGTGC 1740

AGTTCGTGCC GTTGTGGAAT TGGCAGGTGT GGCAGATATT ACATCTAAAT CACTTGGTTC 1800

TAACACTCCA ATCAACATTG TTCGTGCAAC TGTTGAAGGT TTGAAACAAT TGAAACGCGC 1860

TGAAGAAATT GCTGCCCTTC GTGGTATTTC AGTTTCTGAT TTGGCATAAG AAAGGGGATA 1920 

AAATGGCTCA AATTAAAATT ACTTTGACTA AGTCTCCAAT CGGACGCATT CCATCACAAC 1980 

GTAAAACTGT TGTAGCACTT GGACTTGGCA AATTGAACAG CTCTGTTATT AAAGAAGATA 2040 

ACGCTGCTAT CCGTGGTATG ATTACAGCAG TATCTCACTT AGTAACAGTT GAAGAAGTAA 2100 

ACTAATGAAG TTTTAGGGGA TGTGCACTGT ACCATCCCCT AAAACTAGAT ATAGTCATCT 2160 

ATGATGACAT CGTATAGGCG AGTTGATGGG GGAGAGAACC TTTTCTCCCT TATCGGCGCT 2220 

AGCATTTTAC AAAAGAGGAG AAAATAAAAA TGAAACTTCA TGAATTGAAA CCTGCAGAAG 2280

GTTCTCGTAA AGTACGTAAC CGCGTTGGTC GTGGTACTTC ATCAGGTAAC GGTAAAACAT 2340 
CTGGTCGTGG TCAAAAAGGT CAAAAAGCTC GTAGCGGTGG CGGAGTTCGC CTTGGTTTTG 2400

AAGGTGGACA AACTCCATTG TTCCGTCGTC TTCCAAAACG TGGATTCACT AACATCAACG 2460 

CTAAAGAATA CGCAATTGTG AACCTTGACC AATTGAACGT CTTTGAAGAT GGTGCTGAAG 2520 

TAACTCCAGT TGTTCTTATC GAAGCAGGAA TTGTTAAAGC TGAAAAGTCA GGTATTAAAA 2580

TTCTTGGTAA CGGTGAGTTG ACTAAGAAAT TGACTGTGAA AGCAGCTAAA TTCTCTAAAT 2640 

CAGCTGAAGA AGCTATCACT GCTAAAGGTG GTTCAGTAGA AGTCATCTAA GAGAGGTGAC 2700 

CTATGTTTTT TAAATTATTA AGAGAAGCTC TTAAAGTCAA GCAGGTTCGA TCAAAAATTT 2760 

TATTTACAAT TTTTATCGTT TTGGTCTTTC GTATCGGAAC TAGCATTACA GTTCCTGGTG 2820 

TGAATGCCAA TAGCTTGAAT GCTTTAAGTG GATTATCCTT CTTAAACATG TTGAGCTTGG 2880

TGTCGGGGAA TGCCCTAAAA AACTTTTCGA TTTTTGCCCT AGGAGTTAGT CCCTATATCA 2940 

CCGCTTCTAT TGTTGTCCAA CTCTTGCAAA TGGATATTTT ACCCAAGTTT GTAGAGTGGG 3000 

GTAAACAAGG GGAAGTAGGT CGAAGAAAAT TGAATCAAGC TACTCGTTAT ATTGCTCTAG 3060 

TTCTCGCTTT TGTGCAATCT ATCGGGATTA CAGCTGGTTT TAATACCTTG GCTGGAGCTC 3120 

AATTGATTAA AACTGCTTTA ACTCCACAAG TTTTTCTGAC GATTGGTATC ATCTTAACAG 3180

CTGGTAGTAT GATTGTCACT TGGTTGGGTG AGCAAATTAC AGATAAGGGA TACGGAAACG 3240 

GTGTTTCCAT GATTATCTTT GCCGGGATTG TTTCCTCAAT TCCAGAGATG ATTCAGGGCA 3300 

TCTATGTGGA CTACTTTGTG AACGTCCCAA GTAGCCGTAT CACTTCATCT ATCATTTTCG 3360 

TAATCATTTT GATTATTACT GTATTGTTGA TTATTTACTT TACAACTTAT GTTCAACAAG 3420 

CAGAATACAA AATTCCAATC CAATATACTA AGGTTGCACA AGGTGCTCCA TCTAGCTCTT 3480

ACCTTCCGTT AAAGGTAAAT CCTGCTGGAG TTATCCCTGT TATCTTTGCC AGTTCGATTA 3540 

CTGCAGCGCC TGCGGCTATT CTTCAGTTTT TGAGTGCCAC AGGTCATGAT TGGGCTTGGG 3600 

TAAGGGTAGC ACAAGAGATG TTGGCAACTA CTTCTCCAAC TGGTATTGCC ATGTATGCTT 3660 
TGTTGATTAT TCTCTTTACA TTCTTCTATA CGTTTGTACA GATTAATCCT GAAAAAGCAG 3720

CAGAGAGCCT ACAAAAGAGT GGTGCCTATA TCCATGGAGT TCGTCCTGGT AAAGGTACAG 3780

AAGAATATAT GTCTAAACTT CTTCGTCGTC TTGCAACTGT TGGTTCCCTC TTCCTTGGTG 3840

TGATTTCCAT TTTACCGATT GCAGCTAAAG ATGTATTTGG TCTTTCTGAT GTTGTTGCCT 3900

TTGGTGGAAC AAGTCTCTTG ATCATTATCT CTACAGGTAT CGAAGGAATC AAGCAATTGG 3960

AAGGTTACCT ATTGAAACGT AAGTATGTTG GTTTCATGGA CAGAACAGAA TAAAAGTATT 4020

TACTGAATCA GTAAATACTG AGGGAGTGGA GGTTTAAACT CTGACATTTG TAAGAGTTGG 4080

ATCTCCCCTC TTCTATTTTG TTTTTAAATC GGGGTGAAAA AACTTTTTGC TTCTATTTAA 4140

AAACAAAATA AGGAGATCAA ATCATGAATC TTTTGATTAT GGGCTTACCT GGTGCAGGTA 4200

AGGGAACTCA AGCAGCAAAA ATCGTAGAAC AATTCCATGT TGCACATATC TCAACAGGTG 4260

ATATGTTCCG CGCTGCAATG GCAAATCAAA CTGAAATGGG TGTTCTTGCT AAGTCATATA 4320

TTGACAAGGG TGAATTGGTT CCTGACGAAG TTACAAATGG AATCGTAAAA GAACGCCTTT 4380

CACAAGATGA TATTAAAGAA ACAGGATTCT TATTGGATGG TTACCCACGT ACAATTGAAC 4440

AAGCTCATGC CTTGGACAAA ACATTGGCTG AACTTGGCAT TGAACTAGAA GGTATTATCA 4500

ATATTGAAGT GAACCCTGAC AGCCTCTTGG AACGTTTGAG TGGCCGTATC ATCCACCGCG 4560

TAACTGGAGA AACTTTCCAC AAGGTCTTTA ACCCACCAGT TGACTATAAA GAAGAAGATT 4620

ACTACCAACG TGAAGATGAT AAGCCTGAGA CAGTAAAACG TCGTTTGGAT GTTAATATTG 4680

CTCAAGGAGA ACCAATCATT GCTCACTACC GTGCCAAAGG TTTGGTTCAT GACATCGAAG 4740

GTAATCAAGA TATCAATGAT GTCTTCTCAG ATATTGAAAA AGTATTGACA AATTTGAAAT 4800

AAAGCGTTTT TCACACTTGC AAAAATCCGC TACAAATGTT ATACTGAAAT AGTCTGACTT 4860

ATAATTGTTG TCTCTGTGTC TAGAGGCATC GAATCGAAAT TTATGGAGGT GCTTTTGCGT 4920

GGCAAAAGAC GATGTGATTG AAGTTGAAGG CAAAGTAGTT GATACAATGC CGAATGCAAT 4980

GTTTACGGTT GAACTTGAAA ATGGACATCA GATTTTAGCA ACAGTTTCTG GTAAAATTCG 5040

TAAAAACTAT ATTCGTATTT TAGCGGGAGA TCGTGTTACT GTCGAAATGA GTCCATATGA 5100

CTTGACACGT GGACGTATCA CTTACCGCTT TAAATAATCG AAAAACTTGG AGGGATAAGA 5160

AATGAAAGTA AGACCATCGG TCAAACCAAT TTGCGAATAC TGTAAAGTTA TTCGTCGTAA 5220

TGGTCGTGTT ATGGTAATTT GCCCAGCAAA TCCAAAACAC AAACAACGTC AAGGATAAGA 5280

TAGAAAGGAG AAAACATGGC TCGTATTGCT GGAGTTGATA TTCCAAATGA CAAACGCGTA 5340

GTAATCTCAT TGACTTATGT TTATGGTATC GGACTTGCAA CATCTAAGAA AATTTTGGCT 5400

GCTGCTGGAA TCTCAGAAGA TGTTCGTGTA CGTGATCTTA CATCAGATCA AGAAGATGCT 5460

ATCCGTCGTG AAGTGGATGC AATCAAAGTT GAAGGTGACC TTCGTCGTGA AGTAAACTTG 5520

AACATCAAAC GTTTGATGGA AATCGGTTCA TACCGTGGTA TCCGTCACCG TCGTGGACTT 5580 

CCTGTCCGTG GACAAAATAC TAAAAACAAC GCTCGCACTC GTAAAGGTAA AGCTGTTGCG 5640 

ATTGCTGGTA AGAAAAAATA ATATAGGAGG TAAAAGTCTT GGCTAAACCA ACACGTAAAC 5700 

GTCGTGTGAA AAAGAATATC GAATCTGGTA TTGCTCATAT TCACGCTACA TTTAATAACA 5760 

CTATTGTTAT GATTACTGAT GTGCATGGTA ATGCAATTGC TTGGTCATCA GCTGGTGCTC 5820 

TTGGTTTCAA AGGTTCTCGT AAATCTACAC CATTCGCTGC TCAAATGGCT TCTGAAGCTG 5880 

CTGCTAAATC TGCACAAGAA CACGGTCTTA AATCAGTTGA AGTTACTGTA AAAGGTCCAG 5940

GTTCTGGTCG TGAGTCAGCT ATTCGTGCGC TTGCTGCCGC TGGTCTTGAA GTAACAGCAA 6000 

TTCGTGATGT GACTCCAGTG CCACACAATG GTGCTCGTCC TCCAAAACGT CGCCGTGTAT 6060 

AATCATCGCA TTACACTGCT TTTCGTTTAA GAGGGAGTAA CTAAATGATC GAGTTTGAAA 6120 

AACCAAATAT AACAAAAATT GATGAAAATA AAGATTATGG CAAGTTTGTA ATCGAACCAC 6180 

TTGAACGTGG CTACGGTACA ACTCTTGGTA ACTCTCTTCG TCGTGTACTT CTAGCTTCTC 6240

TACCAGGAGC AGCTGTGACA TCTATCAACA TTGATGGTGT GTTACATGAG TTTGACACAG 6300 

TTCCAGGTGT TCGTGAAGAC GTGATGCAAA TCATTCTGAA CATTAAAGGA ATTGCAGTGA 6360 

AATCGTACGT TGAAGACGAA AAAATCATCG AACTGGATGT TGAAGGTCCT GCTGAAGTAA 6420 

CAGCTGGTGA CATTTTGACA GATAGCGATA TTGAAATTGT AAATCCAGAT CATTATCTCT 6480 

TTACAATCGG TGAAGGTTCT TCTCTAAAAG CGACTATGAC TGTTAACAGT GGTCGTGGAT 6540

ATGTACCTGC TGACGAAAAT AAAAAGGATA ATGCACCAGT TGGAACACTT GCTGTAGATT 6600 

CTATTTATAC ACCAGTTACA AAAGTCAACT ATCAAGTGGA ACCTGCTCGT GTAGGTAGCA 6660 

ATGATGGTTT CGACTCTAG 6679 
(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1703 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

AGAATACCTT GGGGCAACTG TTCAAGTCAT TCCTCATATC ACAGATGCTT TGAAAGAAAA 60 

AATCAAGAGT GCCGCTCTAA CGACCGACTC TGATGTCATT ATCACAGAGG TTGGTGGAAC 120 

AGTAGGAGAT ATCGAGTCCT TGCCATTCCT AGAGGCTCTT CGCAGATGAA GGCAGATGTG 180 

GGGCGGATAA TGTCATGTAT ATCCATACAA CCTTCTTCCT TACCTCAAGG CTGCTGGTGA 240 

AATGAAACCA AACCAACCCA ACACTCTGTC AAAGATTGCG TGGCTTGGGA ATCCAACCAA 300 

ATATGTTGGT TATTCGTACA GAAGAGCCAG CTGGTCAAGG AATTAAAAAT AAACTGGCCC 360 

AGTTCTGTGA TGTGGCACCA GAATCCCTAA TCGAATCGTT GGATGTTGAA CACCTTTACC 420 

AAATTCCACT GAACTTGCAG GCACAAGGGA TGGACCAAAT TGTTTGTGAT CATTTGAAAT 480

TAGACGCACC AGCAGCGGAT ATGACAGAAT GGTCAGCCAT GGTGGACAAG GTCATGAACC 540 

TCAAGAAACA AGTTAAGATT TCCCTTGTTG GTAAGTATGT GGAGTTGCAA GATGCCTATA 600 

TCTCAGTGGT CGAAGCCTTG AAACACTCTG GCTATGTCAA TGATGTAGAA GTTAAAATCA 660 

ATTGGGTCAA TGCCAATGAT GTGACAGCAG AGAATGTAGC AGAACTCTTG TCTGATGCGG 720 

ACGGGATCAT CGTACCAGGT GGTTTTGGTC AACGTGGTAC AGAAGGGAAA ATCCAAGCCA 780

TCCGCTATGC GCGTGAAAAT GATGTTCCAA TGTTGGGAGT CTGCTTGGGA ATGCAGTTGA 840 

CATGTATCGA GTTTGCTCGT CACGTTTTAG GTCTTGAAGG TGCCAATTCT GCAGAGCTTG 900 

CACCAGAAAC AAAATACCCT ATCATTGATA TCATGCGTGA TCAGATTGAT ATTGAGGATA 960

TGGGTGGAAC CCTTCGTTTG GGACTTTATC CGTCTAAGTT GAAACGTGGC TCTAAGGCTG 1020 

CTGCTGCTTA TCACAATCAA GAAGTGGTGC AACGCCGTCA CCGTCACCGT TATGAGTTTA 1080

AATAATGCCT TCCGTGAGCA GTTTGAGGCA GCAGGTTTGT CTTTTCAGGA GTTTCTCCAG 1140 

ACAATCGTTT GGTAGAAATC GTGGAAATCC TGAAAATAAA TTCTTTGTAG CTTGTCAGTA 1200 

TCACCCTGAA CTGTCAGCCG TCCAACCGAC CAGAAGAACT CTACACTGCC TTTGTTACTG 1260 

CAGCGGTTGA GAACAGCAAT TAGCAAAATC AGAACCTTTG AGAAAAATCT CAGAGGTTTT 1320 

TTGCATACGA TGATATTGCA GTATATCTGA GGTAGGAGTC CTCTGTATGT ACCTGCTACC 1380

GTTGAAATCA ATAGCGACTC CCTCTTGCCC TGTGCTAGTG AATGGATTTA TCAGTATATT 1440 

GAAATGAAAT AAAATTTGAA CAAATTAATT CGGAAAGCCA AATCAATTTC TAGCAAAGTT 1500 

TTAGGAACTG GATTGTATAG TGAATTGAAA TAAGATGTGA ACATCTCTAT CAGGAAAGTC 1560 

AAATTAATTT ATAGAAATAT TTTAGCAGTC AAGATGGACT GTTATAGATT CAATATACTA 1620 

TACTTTTTTA ATTTAATCCA CTATAATAAA ATGAAATAAT AACAGGACAA ATCGTTCAGG 1680

ACAGTCAAAT CGACTCTAGA GGA 1703 
(2) INFORMATION FOR SEQ ID NO: 35: 
(i) SEQUENCE CHARACTERISTICS: 



(A) LENGTH: 1620 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:

ATTGTAAAAC ACCAAGGAAA AACAGCTAAA GAAGCGAAAG AATTGGCCAT TGACTACATG 60 

AATAAGGTTG GCATTCCAGA CGCAGATAGA CGTTTTAATG AATACCCATT CCAATATTCT 120 

GGAGGAATGC GTCAACGTAT CGTTATTGCG ATTGCCCTTG CCTGCCGACC TGATGTCTTG 180 

ATCTGTGATG AGCCAACAAC TGCCTTGGAT GTAACTATTC AAGCTCAGAT TATTGATTTG 240 

CTAAAATCTT TACAAAACGA GTATCATTTC ACAACAATCT TTATTACCCA CGACCTTGGT 300

GTGGTGGCAA GTATTGCGGA TAAGGTAGCG GTTATGTATG CAGGAGAAAT CGTTGAGTAT 360 

GGAACGGTTG AGGAAGTCTT CTATGACCCT CGCCATCCAT ATACATGGAG TCTCTTGTCT 420 

AGCTTGCCTC AGCTTGCTGA TGATAAAGGG GATCTTTACT CAATCCCAGG AACACCTCCG 480 

TCACTTTATA CTGACCTGAA AGGGGATGCT TTTGCCTTGC GTTCTGACTA CGCAATGCAG 540 

ATTGACTTCG AACAAAAAGC TCCTCAATTC TCAGTATCAG AGACACATTG GGCTAAAACT 600

TGGCTTCTTC ATGAGGATGC TCCAAAAGTA GAAAAACCAG CTGTGATTGC AAATCTCCAT 660 

GATAAGATCC GTGAAAAAAT GGGATTTGCC CATCTGGCTG ACTAGGAGGA AGGAAATGTC 720 

TGAAAAATTA GTAGAAATCA AAGATTTAGA AATTTCCTTC GGTGAAGGAA GTAAGAAGTT 780 

TGTCGCGGTT AAAAATGCTA ACTTCTTTAT CAACAAGGGA GAAACTTTCT CGCTTGTAGG 840 

TGAGTCCGGT AGTGGGAAAA CAACTATTGG TCGTGCTATC ATCGGTCTAA ATGATACAAG 900

TAATGGAGAT ATCATTTTTG ATGGTCAAAA GATTAATGGT AAGAAATCGC GTGAACAAGC 960 

TGCGGAATTG ATTCGTCGAA TCCAGATGAT TTTCCAAGAC CCTGCCGCAA GTTTGAATGA 1020 

ACGTGCGACT GTTGATTATA TTATTTCTGA AGGTCTTTAC AATCACCGTT TATTTAAGGA 1080 

TGAAGAAGAA CGTAAAGAGA AAGTTCAAAG TATTATCCGT GAAGTAGGTC TTCTTGCTGA 1140 

GCACTTGACT CGTTACCCTC ATGAATTCTC AGGCGGTCAA CGTCAACGTA TCGGTATTGC 1200

CCGTGCCTTG GTCATGCAAC CAGACTTTGT TATTGCAGAT GAGCCAATTT CAGCCTTGGA 1260 

CGTTTCTGTA CGTGCCCAAG TCTTGAACTT GCTCAAAAAA TTCCAAAAAG AGCTCGGCTT 1320 

GACCTATCTC TTCATCGCCC ATGACTTGTC GGTTGTTCGC TTTATTTCAG ATCGTATCGC 1380 
AGTTATTTAC AAGGGTGTTA TTGTAGAGGT TGCAGAAACA GAAGAATTGT TTAACAATCC 1440

AATTCACCCA TATACTCAAG CCTTGCTTTC AGCGGTACCA ATCCCAGATC CAATCTTGGA 1500 

ACGTAAGAAG GTCTTGAAGG TTTACGACCC AAGTCAACAC GACTATGAGA CTGATAAGCC 1560 

ATCTATGGTA GAAATCCGTC CAGGTCACTA TGTTTGGGCG AACCAAGCCG AATTAGCACG 1620 

(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 984 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 

GTACCCGGGG ATCAGGTTTT ACGGATTCTT GAAGTTCTCT GTGGGCAGGA CCTCTTGCAG 60 

GTAAGAGTAA GAGTGATTCT ACAAGATTTA CTAGAAGCTA GAAAAATGTG GCAAGCTAAT 120 

GTCAGCTTTC AAAATGCCAT GGAATATCTG GTCTTGAAAG AAATATAAAC TCAAAAATGA 180 

ATGATAAAGA AAGGAAAGGG CTGTTTTATG GACAAAAAAG AATTATTTGA CGCGCTGGAT 240 

GATTTTTCCC AACAGTTATT GGTAACCTTG GCCGATGTGG AAGCCATCAA GAAAAATCTC 300 

AAGAGCCTGG TAGAGGAAAA TACAGCTCTT CGTTTGGAAA ATTCTAAGTT GCGAGAACGC 360 

TTGGGTGAGG TGGAAGCAGA TGCTCCTGTC AAGGCCAAGC ATGTTCGTGA AAGTGTCCGT 420 

CGCATTTACC GTGATGGATT TCACGTATGT AATGATTTTT ATGGACAACG TCGAGAGCAG 480 

GACGAGGAAT GTATGTTTTG TGACGAGTTG CTATACAGGG AGTAGGCATG CAGATTCAAA 540 

AAAGTTTTAA GGGGCAGTCT CCCTATGGCA AGCTGTATCT AGTGGCAACG CCGATTGGCA 600 

ATCTAGATGA TATGACCTTT CGAGCTATCC AGACCTTGAA AGAAGTAGAT TGGATTGCTG 660 

CTGAGGATAC GCGCAATACA GGTCTTTTGC TCAAGCATTT TGACATTTCC ACCAAGCAGA 720 

TCAGTTTTCA TGAGCACAAT GCCAAGGAAA AAATTCCTGA TTTGATTGGT TTCTTGAAAG 780 

CAGGGCAAAG TATTGCTCAG GTCTCTGATG CCGGTTTGCC TAGCATTTCA GACCCTGGTC 840 

ATGATTTAGT TAAGGCAGCT ATTGAGGAAG AAATTGCAGT TGTGACAGTT CCAGGTGCCT 900 

CTGCAGGAAT TTCTGCCTTG ATTGCCAGTG GTTTAGCGCC ACAGCCACAT ATCTTTTACG 960 
GTTTTTTACC GAGAAAATCA GGTC 984



(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1554 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 

CTAGAGTCGA AAAGACAAGC AGGAGCGTAT TTCCAAAGAA ACCATGGAAA TCTATGCCCC 60 

GCTTGCCCAT CGTTTGGGGA TTTCCAGTGT CAAATGGGAA TTAGAAGACT TGTCTTTCCG 120 

TTATCTCAAT CCAACGGAGT TTTACAAGAT TACCCATATG ATGAAGGAAA AGCGCAGGGA 180 

GCGTGAGGCC TTGGTGGATG AGGTAGTCAC AAAATTAGAG GAGTATACGA CAGAACGTCA 240 

CTTGAAAGGG AAGATTTATG GTCGTCCCAA GCATATTTAC TCAATTTTCC GCAAAATGCA 300 

GGACAAGAGA AAACGGTTTG AGGAAATCTA TGATCTGATT GCTATTCGTT GTATTTTAGA 360 

TACCCAAAGT GATGTTTATG CCATGCTTGG TTACGTGCAT GAATTTTGGA AACCGATGCC 420 

AGGTCGCTTC AAAGACTATA TCGCCAACCG CAAGGCCAAT GGTTATCAGT CTATCCATAC 480 

GACTGTTTAT GGACCAAAAG GGCCGATTGA ATTCCAGATT CGAACCAAGG AAATGCACGA 540 

GGTGGCTGAG TACGGGGTTG CGGCTCACTG GGCTTATAAG AAAGGTATAA AGGGGCAAGT 600 

TAACAGCAAG GAATCAGCTA TTGGAATGAA CTGGATCAAG GAGATGATGG AGCTCCAAGA 660 

CCAGGCTGAT GATGCTAAGG AATTTGTGGA CTCTGTTAAG GAAAACTATT TGGCTGAGGA 720 

GATTACCGTT TTACCCCAGA TGGAGCTGTC CGTTCCTTCC CAAAGATTCA GGACCGATTG 780 

ATTTTGCCTA CGAAATCCAT ACCAAGGTCG GTGAAAAGCA ACTGGTGCCA AGGTCAATGG 840 

CCGCATGGTT CCACTGACAC CCAAGTTAAA GGACAGGGGA TCAGGTTGAA ATTATCGCCA 900 

ACCCGAACTC CTTTGGACCT TAGCCGTGAC TGGCTCAATA TGGTCAAGAC TAGCAAGGCG 960 

CGCAATAAGA TTCGCCAGTT CTTTAAAAAC CAAGATAAGG AATTGTCTGT CAACAAGGGT 1020 

CGTGAGATGC TGATGGCTCA GTTCCAAGAA AATGGCTATG TGGCAAATAA ATTTATGGAC 1080 

AAGCGCCACA TGGATCAAGT TCTGCAAAAG ACCAGTTACA AGACAGAAGA CTCCCTCTTT 1140 

GCGGCCATTG GTTTTGGGGA AATCGGTGCG ATTACCGTCT TTAACCGTCT GACTGAAAAG 1200 
GAGCGCCGTG AGGAAGAGCG TGCCAAGGCC AAGGCTGAGG CAGAGGAGCT TGTCAAAGGT 1260

GGCGAGGTCA AGGTTGAAAA TAAAGAAACT CTCAAGGTCA AGCATGAGGG GGGAGTGGTT 1320 

ATTGAAGGTG CTTCTGGTCT CCTAGTGCGG ATTGCTAAGT GTTGTAACCC CGTGCCTGGT 1380 

GACGATATTG TTGGCTACAT TACCAAGGGT CGTGGTGTGG CTATTCACCG TGTGGACTGT 1440 

ATGAACCTGC GTGCCCAAGA AAACTACGAG CAACGTCTCC TTGATGTGGA ATGGGAAGAC 1500 

CAGTACTCTA GCTCAAATAA GGAGTATATG GCCCATATCG ACTCTAGAGG ATCC 1554 
(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3190 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

CGTCAGTGCT AAAACAGGGG AAATTCTGGC AACAACGCAA CGACCGACCT TTGATGCAGA 60 

TACAAAAGAA GGCATTACAG AGGACTTTTT TGGCGTGATA TCCTTTACCA AAGTAACTAT 120 

GAGCCAGGTT CCACTATGAA AGTGATGATG TTGGCTGCTG CTATTGATAA TAATACCTTT 180 

CCAGGAGGAG AAGTCTTTAA TAGTAGTGAG TTAAAAATTG CAGATGCCAC GATTCGAGAT 240 

TGGGACGTTA ATGAAGGATT GACTGGTGGC AGAATGATGT CTTTTTCTCA AGGTTTTGCA 300

CACTCAAGTA ACGTTGGGAT GACCCTCCTT GAGCAAAAGA TGGGAGATGC TACCTGGCTT 360 

GATTATCTTA ATCGTTTTAA ATTTGGTGTT CCGACCCGTT TCGGTTTGAC GGATGAGTAT 420 

GCTGGTCAGC TTCCTGCGGA TAATATTGTC AACATTGCGC AAAGCTCATT TGGACAAGGG 480 

ATTTCAGTGA CCCAGACGCA AATGATTCGT GCCTTTACAG CTATTGCTAA TGACGGTGTC 540 

ATGCTGGAGC CTAAATTTAT TAGTGCCATT TATGATCCAA ATGATCAAAC TGCTCGGAAA 600

TCTCAAAAAG AAATTGTGGG AAATCCTGTT TCTAAAGATG CAGCTAGTCT AACTCGGACT 660 

AACATGGTTT TGGTAGGGAC GGATCCGGTT TATGGAACCA TGTATAACCA CAGCACAGGC 720 

AAGCCAACTG TAACTGTTCC TGGGCAAAAT GTAGCCCTCA AGTCTGGTAC GGCTCAGATT 780 

GCTGACGAGA AAAATGGTGG TTATCTAGTC GGGTTAACCG ACTATATTTT CTCGGCTGTT 840 

CGATGAGTCC GGCTGAAAAT CCTGGATTTT ATCTTGTATG TGACGGTCCA ACAACCTGGA 900
ACATTATTCA GGTATTCAGT TGGGAGAATT TGCCAATCCT ATCTTGGAGC GGGCTTCAGC 960

TATGAAAGAC TCTCTCAATC TTCAAACAAC AGCTAAGGCT TTGGAGCAAG TAAGTCAACA 1020

AAGTCCTTAT CCTATGCCCA GTGTCAAGGA TATTTCACCT GGTGATTTAG CAGAAGAATT 1080

GCGTCGCAAT CTTGTACAAC CCATCGTTGT GGGAACAGGA ACGAAGATTA AAAACAGTTC 1140

TGCTGAAGAA GGGAAGAATC TTGCCCCGAA CCAGCAAGTC CTTATCTTAT CTGATAAAGC 1200

AGAGGAGGTT CCAGATATGT ATGGTTGGAC AAAGGAGACT GCTGAGACCC TTGCTAAGTG 1260

GCTCAATATA GAACTTGAAT TTCAAGGCTC GGGCTCTACT GTGCAGAAGC AAGATGTTCG 1320

TGCTAACACA GCTATCAAGG ACATTAAAAA AATTACATTA ACTTTAGGAG ACTAATATGT 1380

TTATTTCCAT CAGTGCTGGA ATTGTGACAT TTTTACTAAC TTTAGTAGGA ATTCCGGCCT 1440

TTATCCAATT TTATAGAAAG GCGCAAATTA CAGGCCAGCA GATGCATGAG GATGTCAAAC 1500

AGCATCAGGC AAAAGCTGGG ATTCCTACAA TGGGAGGTTT GGTTTTCTTG ATTACTTCTG 1560

TTTTGGTTGC TTTCTTTTTC GCCCTATTTA GTAGCCAATT CAGCAATAAT GTGGGAATGA 1620

TTTTGTTCAT CTTGGTCTTG TATGGCTTGG TCGGATTTTT AGATGACTTT CTCAAGGTCT 1680

TTCGTAAAAT CAATGAGGGG CTTAATCCTA AGCAAAAATT AGCTCTTCAG CTTCTAGGTG 1740

GAGTTATCTT CTATCTTTTC TATGAGCGCG GTGGCGATAT CCTGTCTGTC TTTGGTTATC 1800

CAGTTCATTT GGGATTTTTC TATATTTTCT TCGCTCTTTT CTGGCTAGTC GGTTTTTCAA 1860

ACGCAGTAAA CTTGACAGAC GGTGTTGTAC GGTTTAGCTA GTATTTCCGT TGTGATTAGT 1920

TTGTTTGCCT ATGGAGTTAT TGCCTATGTG CAAGGTCAGA TGGATATTCT TCTAGTGATT 1980

CTTGCCATGA TTGGTGGTTT GCTCGGTTTC TTCATCTTTA ACCATAAGCC TGCCAAGGTC 2040

TTTATGGGTG ATGTGGGAAG TTTGGCCCTA GGTGGGATGC TGGCAGCTAT CTCTATGGCT 2100

CTCCACCAGG AATGGACTCT CTTGATTATC GGAATTGTGT ATGTTTTTGA AACAACTTCT 2160

GTTATGATGC AAGTCAGTTA TTTCAAACTG ACAGGTGGTA AACGTATTTT CCGTATGACG 2220

CCTGTACATC ACCATTTTGA GCTTGGGGGA TTGTCTGGTA AAGGAAATCC TTGGAGCGAG 2280

TGGAAGGTTG ACTTCTTCTT TTGGGGAGTT GGGCTTCTAG CAAGTCTCCT GACCCTCGCA 2340

ATTTTGTATT TGATG7AAGA ATGGCACCCT GATGTTTCAG GGTGTTTTTG TGTTTAAATA 2400

CACAATGAAA ATCAAAGAAC AAACTAGAAA GCTAACTTTA GGCTGCTCAA AATATAATAT 2460

ATTGAAACTA GAATAGTACA CCTCTACTTC TAAAACATTG TTAGAAATCG ATTTGACTGT 2520

CCTGAACGAT TTATCCTGTT CTTATTTCAT TTTACTATAC AGTTTCGAGG TTGTAGATAA 2580

GGCGAAGCTG ATGTGGTTTG AAGAGATTTT CTGAAAAGTG TTAACACCTA CAGACAAGCC 2640

TGACGATAGC AAGAACTACC CTACTCGATA GGTATCGGCT TTTGCTTTCT GAAAAAAATT 2700

ATTTTAAGCA TTTGACAAAT CTAGCAACAA AAAATTCTAT AAATATAATA GATTGAAACT 2760
AGAATAGTAC ACATCTACTT CTAAAACATT GTTAGAAATC GATTTGACTG TCCTGATCGA 2820

TTTGTCCTGT TCTTGTTTCA TTTTACTATA TTTCTATGAT AAAACGCATA GTATCAAGTT 2880

TTCTTAATCC CCTGATACTA TGCGTGTTTG TAATTTTTAA GATTTTGTGC TTAGAGTCGA 2940

CTCCTTATTT TAGATATTTA AAAGGAATCT CACTTCCACA GAGCCAGTTG TAGACTTGGT 3000

CATTAACAAA TACATTCATG GCTTCGTGAG CATACTCAGG CATGATACGA TAGGTTTTAT 3060

CGCAGGTCAG ACGATTATAA ATCGCAAACT GGGTAATGGG ATAGCAAACA TCGTCGTCCA 3120

AGCCCGTAAT CATCTTAACC TCACCTTGGA TACGATGGGC AAGATTTTTG ACATCGACTC 3180

TAGAGGATCC 3190



(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5992 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 

TTGTTCTTAG TGTTCCGACA AAGATTCTTC AAAATCAAAT CATGGAAGAA GAAGGTAAAC 60 

GTCTCAAGGA AGTGTTCCAT ACAGATATTC ATAGCTTAAA GGGACCACAA AATTATCTGA 120 

AGTTGGATGC CTTTTATCAT TCTTGCAGGA AAATGATGAA AATCGCTTAT TTAGACGCTT 180 

TAAAATGCAA GTCTTGGTCT GGCTTACTGA GACAGAGACA GGAGATTTGG ATGAAATCGG 240 

GCAACTCTAC CGTTACCAAC ATTTTCTAGC AGACCTTCGT CATAATGGGA ATTTATCATC 300 

CCAGAGCTTA TTTGTGACGG AAGATTTTTG GAAACGTAGT CAAGAAAGGG CAGAGACTTG 360 

CAAGCTTTTA GTGACTAATC ATGCCTATCT CGTAACCAGA CTTGAAGATA ATCCTGAATT 420

TGTCAGTGAC CGTTTACTGA TTATTGATGA AGTCCAAAAG ATTTTGTTAG CTCTAGAAAA 480 

TCTGCTTCAA GAGACCTACG ATATACAATC TATTATCGAT TTAATTGATA AGGCTTTAGT 540 

AGGAGAAGAA AACAGGGTTC AACAACGGAT ACTAGAAAGT ATTCGCTTTG AGTGCCTCTA 600 

CTTGATAGAA CAATTTCAGT CTGGCAAATC TAGGAAAAAT ATCTTAGATT CTCTGGACAA 660 

TCTCCATCAG TATTTTTCAG AATTAGAAGT GGAAGGCTTT GATGAGCTGG TTCGCTATTT 720 

TACAGCTGAA GGTGATTACT GGCTTGAAGT AACTGAAACG AGTCAAAAGA AAATTCAGAT 780 
TTCTTCTACA AAATCAGGCC GTACTCTTCT GTCCTCTTTA CTTCCTGAGA GTTGCCAAGT 840

CTTGGGAGTA TCGGCTACTC TTGAGATTAG TCAGAGGGTT TCTTTGGCAG ACCTTTTAGG 900 

CTATCCTGAA GCCAAATTTG TCAAGATTGA ATCTCGGGGA AAACAGGAAC AAGAAGTGGT 960 

TATGGTCAAA GATTTCCCTC TGGTAACAGA AACCTCCTTA GAAGTCTATG CCAGAGAGGT 1020 

AGCTGCTTTA CTAGTGGAAA TTCAAGCTTT CCAGCAACCG ATTTTGGTTC TCTTTACCGC 1080

TAAAGACATG CTTCTAGCAG TATCGGATTT ACTTACAGTT AGCCACTTGG CCCAGTATAA 1140 

AAATGGGGAT GTTCATCAGC TAAAGAAACG CTTTGAAAAA GGTGAACAAC AAATCTTGCT 1200 

TGGTGCAGCA AGTTTCTGGG AGGGAGTTGA TTTTTCAAGC CATCCTTTTG TGATTCAAGT 1260 

TGTACCGAGG CTTCCTTTCC AAAATCCTCA AGAACCCTTG ACGAAAAAGA TTAATCAAGA 1320 

ACTGAATCAA GAAGGGAAAA ATGCCTTTTA TGATTATCAA TTGCCAATGG CCATTATTCG 1380

TTTAAAACAG GCTTTGGGAA GAAGTATGAG ACGTGAATAC CAACGTTCCT TAACTCTTAT 1440 

TTTGGATAGG AGAATCATCG GAAAACGATA CGGCAAACAA ATAGTAGCAT CTCTAGCAGA 1500 

AGAAGCGACT GTTAAAACCA TCTCTCGATC CGAAGTTGAC GAGGCTATTG ATAGATTTTT 1560 

TAATGAACTT TGATAAATAG TATTGTATGA AAGTATAAGG TTAGTACATA TGAAACGTTC 1620 

TCTCGACTCT AGAGTCGATT ATAGTTTGCT CTTGCCAGTA TTTTTTCTAC TGGTCATCGG 1680

TGTGGTGGCT ATCTATATAG CCGTTAGTCA TGATTATCCC AATAATATTC TGCCCATTTT 1740 

AGGGCAGCAG GTCGCCTGGA TTGCCTTGGG GCTTGTGATT GGTTTTGTGG TCATGCTCTT 1800 

TAATACAGAA TTTCTTTGGA AGGTGACCCC CTTTCTATAT ATTTTAGGCT TGGGACTTAT 1860 

GATCTTGCCG ATTGTATTTT ATAATCCAAG CTTAGTTGCA TCAACGGGTG CCAAAAACTG 1920 

GGTATCAATA AATGGAATTA CCCTATTTCA ACCGTCAGAA TTTATGAAGA TATCCTATAT 1980

CCTCATGTTG GCTCGTGTCA TTGTCCAATT TACAAAGAAA CATAAGGAAT GGAGACGCAC 2040 

GGTTCCGCTG GACTTTTTGT TAATTTTCTG GATGATTCTC TTTACCATTC CAGTCCTAGT 2100 

TCTTTTAGCA CTTCAAAGTG ACTTGGGGAC GGCTTTGGTT TTTGTAGCCA TTTTCTCAGG 2160 

AATCGTTTTA TTATCAGGGG TTTCTTGGAA AATTATTATC CCAGTATTTG TGACTGCTGT 2220 

AACAGGAGTT GCTGGTTTCT TAGCTATCTT TATTAGCAAG GACGGACGAG CTTTTCTTCA 2280

CCAGATTGGA ATGCCGACCT ACCAAATCAA TCGGATTTTG GCTTGGCTCA ATCCCTTTGA 2340 

GTTTGCCCAA ACAACGACTT ACCAGCAGGC TCAAGGGCAG ATTGCCATTG GGAGTGGTGG 2400 

CTTATTTGGT CAGGGATTTA ATGCTTCGAA TCTGCTTATC CCAGTTCGAG AGTCAGATAT 2460 

GATTTTTACG GTTATTGCAG AAGATTTTGG CTTTATTGGC TCTGTCCTGG TTATTGCCCT 2520 

60 CTATCTCATG TTGATTTACC GTATGTTGAA GATTACTCTT AAATCAAATA ACCAGTTCTA 2580 
CACTTATATT TCCACAGGTT TGATTATGAT
TGCTGTGACT GGACTACTTC CTTTGACGGG
ATCAGCGATT ATCAGTAATC TGATTGGTGT
TAATCTAGCT GAAGAAAAGA GCGGAAAAGT
ACAAATTAAA TAAGGAGAAA ATCATGGTAA
AAGAAATTGA AGCCTTGACA GTTGTAGATG
TGGTTGGTTT TGAAGAGCAA GTAACGGGTT
TCTTTGATGG AGATTTATCA GACTATGATA
CTGCACATTT ACGTGATAAT CAGACCTTGA
GGAAGAAACT AGCAGCCATT TGTGCGGCAC
AAAATAAGCG ATACACTTGT TATGACGGCG
TCAAGGAAAC AGTAGTGGTA GATGGTCAGT
TTGCCTTTGC CTACGAGTTG GTGGAGCAAC
GAATGCTCTA TCGAGATGTC TTTGGGTAAA
TTTTATGTGG AAAACTCAGG GAAATCATCG
GGTATGAAAT ATCACGATTA CATCTGGGAT
ACTTCAACAG CTGCATTTGT TGAAACATTG
AGTGTCTATC AAGCTTTAAA GGTTTCTACT
TTAGAGAATT TTTTAGAAAA GTACAAGGAA
TTATTTGAAG GAGTTTCTGA CCTATTGGAA
TTGGTCTCTC ATCGAAATGA TCAGGTTTTG
TATTTTACAG AAGTGGTGAC TTCTAGCTCA
ATGCTTTATT TAAGAGAAAA GTATCAGATT
ATTGATATCG AAGCAGGTCA AGCTGCAGGA
AATTTAAGAC AAGTATTAGA CATATAAGAA
ATCTGCAGGC ACAGGATTAT GATGCCAGTC
TTCGTATGCG TCCAGGGATG TACATTGGAT
TCTGGGAAAT TGTTGATAAC TCAATTGACG
AAGTTTTTAT TGAGCCAGAT GATTCGATTA
TCGATATTCA GGAAAAAACA GGTCGTCCTG
CTGGAGGAAA GTTCGGCGGT GGTGGATACA
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/■•i i'ij /■■ill / "T"P^ 
Gil GL. ItUL 


P APATPTTTfi 


AflAATATPGG 


2640 


GATTCCL. 1 1 G 


1 1 Urvl 1 X 




2700 


1 bul I I bUl 1 


TT ATPfZATfSA 


(STTAPPARAP 


2760 


CC GATT CAAA 


r* rr a a A a a dfz 


T'TflT* A TT* A A A 


2820 


AAGTAGCAGT 




PAfZflflPTTTf; 


2880 


TCTTGCGTCG 


AGCCAATATC 


APATflTflATA 




CGCATGCAAT 


CCAAGTAAGA 


rr* A A TP ATfl 


^000 


TGATTGTTCT 


TCCTGGAGGT 


AI GGUlGGi 1 




TTCAAGAATT 


GCAAAGCTTC 


GAGCAAGAAG 




CAATTGCCCT 


CAATCAAGCA 


GAgATATT GA 


JIOU 


TTCAAGAGCA AATCCTTGAT 


GG I UAL. 1 AuG 


**?4 0 


TGACAACCAG 


TCGGGGTCCT 






TAGGAGGGGA 


CGCAGAGAGT 






AATCAGTAAA ACGGGAGTTA 


TTCTCTCGTT 




CTTTTTTCAT 


AAAAAAATGC 


TATAATGAAG 


?4 80 


TTAGGTGGAA 


CTTTACTGGA 


TAATTATGAA 




GCACTGTATG 


GTATCACACA 


AGACCATGAC 


JOVU 


CCTTTTGCGA 


TTGAGACATT 


CGCTCCCAAT 


JuQU 


AATGAAGCCA 


GAGAGCTTGA ACACCCGATT 


?*790 


GACATTTTAA ATCAAGGTGG 


CCGTCATTTT 




GAAATTTTAG 


AAAAAACCTC 


TATAGCAGCT 


3840 


GGCTTTAAGA 


GAAAGCCAAA 


TCCCGAATCC 


3900 


AGCTCTGGTC 


TTGTCATTGG 


TGATCGGCCG 


3960 


CTTGATACCC 


ACTTGTTTAC 


CAGTATCGTG 


4020 


AAAGGAATAA 


GATGACAGAA 


GAAATCAAAA 


4080 


AAATTCAAGT 


TTTAGAGGGC 


TTAGAGGCTG 


4140 


CAACCTCAAA AGAAGGTCTT, CACCATCTAG 


4200 


AGGCCTTGGC AGGATTTGCC AGCCATATTC 


4260 


CTGTTGTGGA TGATGGGCGT 


GGTATCCCAG 


4320 


CTGTTGAGAC 


CGTCTTTACA 


GTCCTTCACG 


4380 


AGGTTTCAGG 


TGGTCTTCAC 


GGGGTGGGGT 


4440

CGTCAGTTGT TAATGCCCTT TCCACTCAAT TAGACGTTCA TGTCCATAAA AACGGTAAGA 4500

TTCATTACCA AGAATACCGT CGTGGTCATG TTGTCGCAGA TCTTGAAATA GTTGGAGATA 4560

CGGATAAAAC AGGAACAACT GTTCACTTCA CACCGGACCC AAAAATCTTC ACTGAAACAA 4620

CAATCTTTGA TTTTGATAAA TTAAATAAAC GGATTCAAGA GTTGGCCTTT CTAAATCGCG 4680

GTCTTCAAAT TTCTATCACT GATAAGCGCC AAGGTTTGGA ACAAACCAAG CATTATCATT 4740

ATGAAGGTGG GATTGCTAGT TACGTTGAAT ATATCAACGA GAACAAGGAT GTAATCTTTG 4800

ATACACCAAT CTATACAGAC GGTGAGATGG ATGATATCAC AGTTGAGGTA GCCATGCAGT 4860

ACACAACGGG TTACCATGAA AAATGTCATG AGTTTCGCCA ATAATATTCA TACACATGAA 4920

GGTGGAACGC ATGAACAAGG TTTCCGTACA GCCTTGACAC GTGTTATCAA CGATTATGCT 4980

CGTAAGAATA AGTTACTGAA AGACAATGAA GACAATCTAA CAGGGGAAGA TGTTCGCGAA 5040

GGCTTAACTG CAGTTATCTC AGTTAAACAC CCAAATCCAC AGTTTGAAGG ACAAACGAAG 5100

ACCAAATTGG GAAATAGCGA AGTGGTCAAG ATTACCAATC GCCTCTTCAG TGAAGCCTTC 5160

TCCGATTTCC TCATGGAAAA TCCACAGATT GCCAAACGTA TCGTAGAAAA AGGAATTTTG 5220

GCTGCCAAGG CTCGTGTGGC TGCCAAGCGT GCGCGTGAAG TCACACGTAA AAAATCTGGT 5280

TTGGAAATTT CCAACCTTCC AGGGAAACTA GCAGACTGTT CTTCTAATAA CCCTGCTGAA 5340

ACAGAACTCT TCATCGTCGA AGGAGACTCA GCTGGTGGAT CAGCCAAATC TGGTCGTAAC 5400

CGTGAGTTTC AGGCTATCCT TCCAATTCGC GGTAAGATTT TGAACGTTGA AAAAGCAAGT 5460

ATGGATAAGA TTCTAGCTAA CGAAGAAATT CGTAGTCTTT TCACAGCCAT GGGAACAGGA 5520

TTTGGCGCAG AATTTGATGT TTCGAAAGCC CGTTACCAAA AACTCGTTTT GATGACCGAT 5580

GCCGATGTCG ATGGAGCCCA CATTCGTACC CTTCTTTTAA CCTTGATTTA TCGTTATATG 5640

AAACCAATCC TAGAAGCTGG CTATGTTTAT ATTGCCCAAC CACCAATCTA TGGTGTCAAG 5700

GTTGGAAGCG AGATTAAAGA ATATATCCAG CCGGGTGCAG ATCAAGAAAT CAAACTCCAA 5760

GAAGCTTTAG CCCGTTATAG TGAAGGTCGT ACCAAACCGA CTATTCAGCG TTATAAGGGG 5820

CTAGGTGAAA TGGACGATCA TCAGCTGTGG GAAACAACCA TGGATCCCGA ACATCGCTTG 5880

ATGGCTAGAG TTTCTGTAGA TGATGCTGCA GAAGCAGATA AAATCTTTGA TATGTTGATG 5940

GGGGATCGAG TAGAGCCTCG TCGTGAGTTT ATCGACTCTA GAGGATCCCC GG 5992

(2) INFORMATION FOR SEQ ID NO: 40:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 907 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

TACAAAAGTA GGTGGAGAGG CTGATTATTT GGTCTTTCCA CGAAATCGTT TTGAGTTGGC 60

TCGCGTTGTG AAATTTGCCA ACCAAGAAAA TATCCCTTGG ATGGTTCTTG GCAATGCAAG 120

CAATATCATC GTTCGTGATG GTGGGATTCG TGGATTTGTC ATCTTGTGTG ACAAGCTCAA 180

TAACGTTTCT GTTGATGGCT ATACCATTGA AGCAGAAGCT GGGGCTAACT TGATTGAAAC 240

AACTCGCATT GCCCTCCGTC ATAGTTTAAC TGGCTTTGAG TTTGCTTGTG GTATTCCAGG 300

AAGCGTTGGC GGTGCTGTCT TTATGAATGC GGGTGCCTAT GGTGGCGAGA TTGCTCACAT 360

CTTGCAGTCT TGTAAGGTCT TGACCAAGGA TGGAGAAATC GAAACCCTGT CTGCTAAAGA 420

CTTGGCTTTT GGTTACCGCC ATTCAGCTAT TCAGGAGTCT GGTGCAGTTG TCTTGTCAGT 480

TAAATTTGCC CTAGCTCCAG GAACCCATCA GGTTATCAAG CAGGAAATGG ACCGCTTGAC 540

GCACCTACGT GAACTCAAGC AACCTTTGGA ATACCCATCT TGTGGCTCGG TCTTTAAGCG 600

TCCAGTCGGG CATTTTGCAG GTCAGTTAAT TTCAGAAGCT GGCTTGAAAG GCTATCGTAT 660

CGGTGGCGTA GAAGTGTCAG AAAAGCATGC AGGATTTATG ATCAATGTCG CAGATGGAAC 720

GGCCAAAGAC TACGAGGACT TGATCCAATC GGTTATCGAA AAAGTCAAGG AACACTCAGG 780

TATTACGCTT GAAAGAGAAG TCCGGATCTT GGGTGAAAGC CTATCGGTAG CGAAGATGTA 840

TGCAGGTGGT TTTACTCCCT GCAAGAGGTA GTGGGGACCT GACAGAGCCC CGATCGGTTA 900

AGCTATG 907
(2) INFORMATION FOR SEQ ID NO: 41:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2764 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:

AGAACCCTTG GATGCAGCCA TTCAGAAGAT TTCTCCAGAA TTGTTTGACC AATATGAAAT 60

CTTTAAATCA CGTGAAATGT TGCTAGAATG GTCACCAAAG AATGTTCATA AAGCAACAGG 120

TTTGGCAAAA CTAATCAGCC ATCTTGGAAT CGACCAAAGT CAAGTGATGG CTTGTGGTGA 180

CGAGGCCAAT GACCTCTCTA TGATTGAATG GGCAGGTCTT GGTGTTGCTA TGCAAAACGC 240

TGTTCCTGAA GTAAAGGCAG CCGCAAATGT AGTGACGCCG ATGACCAACG ATGAGGAAGC 300

TGTCGCCTGG GCTATCGAAG AATATGTGCT AAAGGAGAAC TAAGATATGG GATTGTTTGA 360

CCGTCTATTC GGAAAAAAAG AAGAACCTAA AATCGAAGAA GTTGTAAAAG AAGCTCTGGA 420

AAATCTTGAT TTGTCTGAAG ATGTTGATCC TACCTTCACA GAAGTTGAGG AAGTTTCTCA 480

GGAAGAAGCA GAGGTTGAAA TTGTTGAACA AGCTGTGTTC CAAGAAGAGG AAATCCAAGA 540

CACAGTTGAA GAAAGTCTGG ATTTAGAGCC AGTTGTAGAA GTTTCTCAAA AAGAAGTCGA 600

AGAATTTCCA CACTCAGAAG AAGGGAATAC TGAGTTTCTA GAGACTATAG AAGAAAATAA 660

TTCTGAAGTT CTTGAACCAG AAAGGCCTCA AGCAGAAGAA ACCGTTCAGG AAAAATATGA 720

CCGCAGTCTT AAGAAAACTC GTACAGGTTT CGGTGCCCGC TTGAATGCCT TCTTTGCTAA 780

CTTCCGCTCT GTTGACGAAG AATTTTTCGA GGAACTGGAA GAACTGCTGA TTATGAGTGA 840

TGTTGGTGTC CAAGTCGCTT CTAACTTAAC GGAGGAACTA CGTTACGAAG CCAAGCTTGA 900

AAATGCCAAG AAACCTGATG CACTTCGTCG TGTCATCATT GAGAAATTGG TTGAGCTTTA 960

TGAAAAGGAT GGTAGCTACG ATGAAAGCAT CCACTTCCAA GATAACTTGA CAGTTATGCT 1020

CTTTGTTGGT GTGAATGGTG TTGGGAAAAC AACTTCTATC GGAAAACTAG CCCACCGCTA 1080

CAAACAAGCT GGTAAGAAGG TCATGCTGGT TGCAGCAGAT ACCTTCCGTG CGGGTGCAGT 1140

AGCTCAGCTA GCTGAATGGG GCCGACGAGT AGATGTTCCA GTAGTAACTG GACCTGAAAA 1200

AGCTGATCCA GCCAGCGTGG TCTTTGATGG TATGGAACGT GCCGTGGCTG AAGGTATCGA 1260

TATTCTCATG ATTGATACTG CTGGTCGTCT GCAAAATAAG GATAACCTTA TGGCTGAGTT 1320

GGAAAAGATT GGTCGTATTA TCAAACGTGT TGTGCCAGAA GCACCACATG AAACCTTCTT 1380

GGCACTTGAT GCATCAACAG GTCAAAATGC CCTAGTACAG GCCAAAGAAT TTTCGAAAAT 1440

CACACCTTTA ACGGGAATTG TTTTGACTAA GATTGATGGA ACTGCTCGAG GAGGTGTGGT 1500

TCTAGCCATT CGTGAAGAAC TCAATATTCC TGTAAAATTG ATTGGTTTTG GTGAAAAAAT 1560

CGATGATATT GGAGAGTTTA ACTCAGAAAA CTTTATGAAA GGTCTCTTGG AAGGTTTAAT 1620

CTAATCAGAA GCAAAAATCC TGCAAGGCAT AAACTTGCAG GAAATTTTTT TATTCTAAGC 1680

GACCATCTTG ACGATAGGTG ATATCTGGTT GCCAAGTCCA TTTGGCACCG AATTTTTCAA 1740

GTAGGTCAAA GCTGGCTTGA GGTCCCATGC TTCCAGCTTT ATAGTCATGA AGTGGGGCAC 1800

CATTTTCAGC CCAGAGCTTT TCAATACGGT CAATCAACTT CCATGACGCA CAAACTTCAT 1860 

CCCAGTGGCT AAAGTTAGTT GAGTTGTTAT TTAGGACATC ATAAATCAAT TTTTCGTATG 1920 

GTTCTGGAGA AGCACCAGTT GCAGTCGCAT CTGTACGGTA ATCAAGTGAG TTAGGAGCCA 1980 

AGTTAAATTC TTCTCCTACT TGCTTCCCAT TTAGGCTAAG AGAGAAGCCT TCTGTTGGTT 2040 

GAATATAGAT GGTCAAAATA TTTGGAGCAA GTGGTTCTCC AAAGATAGAA TCCATTTGTT 2100 

TAAAGACGAT GTTGACATGA GTTCCTTTTT CAGTCAGTCG TTTACCTGTA CGGAAAAAGA 2160 

AAGGAACACC ACGGAATCGA TCGCTGTCTA CAAAGAAGGC ACCAGATGTA AAGGTTTCAG 2220 

TTGTTGATTC TGGATTCACA TTTGGCTCGC TACGATAAGA GATGTATTTC ATGCCATCAA 2280 

TCTTACCAGA GCGGTATTGC CCACGGATAA AGTGTTCTTT GAGTTCTTCA TCAGTTGGAT 2340 

GATAGAGGTT TTTAAAGACC TTAATCTTTT CAGCACGAAT CTCGTCTTTT GTGAAGCTTG 2400 

CTGGTTTGTC CATGGCGAGG AGCGAAAGAA GTTGTAGAGT GTGGTTTTGG ACCATGTCAC 2460 

GGAGGGCACC GGATTGGTCA TAGTAGCCAC CACGTTCTTC TACACCCAAG CTCCGCAAAG 2520 

GTAATTTGAA CATTGTCGAA AAATCCTTGT TCCAAACGTT TTCAAAAATC AAGTTTGCAA 2580 

AGCGAACTGC AAAGATGCTT TGGATCATTT CCTTACCAAG ATAATGGTCG ATACGGAAAA 2640 

TTTGTTCTTC GTCAAATGTT GCTAGGAGTT CGTCATTCAA CTTGTTTGCA GTTGCGTAAT 2700 

CTGTACCAAA TGGTTTTTCA ACGATCAAGC GCTCAAAACC TTTGCCATCG ACTCTAGAGG 2760 

ATCC 2764 
(2) INFORMATION FOR SEQ ID NO:42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3189 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:

ACACTGTTTA CGATGATGGT CGTATTGATT ACGTGAAAAA CACTTGGAGA TCTTGTCTGA 60 

TGCGATTGCA GATGGAGCTA ATGTAAAAGG TTACTTCATT TGGTCATTAA TGGATGTCTT 120 

CTCATGGTCA AACGGTTATG AGAAACGTTA TGGTCTCTTC TACGTAGATT TTGAAACTCA 180 

AGAACGTTAT CCTAAGAAAT CAGCTCACTG GTACAAGAAA GTAGCGGAAA CTCAGATTAT 240

AGACTAGTAG AATTAGTCAT TAGATATAGA AI 1 i TAGTGA GTCCAAAAGA TGTTCAAAGA 300

TTTTATCCAA TCTATTTATG AAAAAGTTTA TATTATAAAT TTCGAAAftAT GCTCTCAAAT 360

ACCGTGTTTG ACGAGTGAAG AATTGAAAGT CTTbGAAAAT GGTATGTCTC GACTGGTAAA 420

GAATGGATTT GTCATTCAGA TGATGAGCTG GAAGAATTTA AAAATCTATT TTTAAATTTT 480

ATCAATCCTG AAGAATGGGA TACTATCTCC TTTGATTCAG ATTTTATGCC GTTTCAACAA 540

TCGTAACCAA TTTCTCAAAA AAGTTAAATC TTATATTTAG TACTCTGTAA AACTCTTATC 600

TAATCACGTT GCTTATACTC AATGAAAATC ATAGAAAAAA GCATAGTATC AGGTGTTGAA 660

ACACCTGATA CTATGCTTTT TATTGTGGGA AGATTTACTT TTTTTCTTCT GAAATTGAGT 720

TGTTACCCAG GCTCTTTCAG TTTATTAAGG CTTGATGACT TTAATGTGTT TAGATAGCTT 780

AAAAAGGATT GAATCACTTA GTTTAGAATC TGAAACAATA GTATCAAGAT TTGATATATT 840

ATAAAAAGTA TAAAAATCAA ACTTATTGAA CTTACTATGA TCTGCGAGTA AATATTTTTT 900

ATTAGAATTA TTTAAAGCGA TGCGTTGAGC CTCTCCCTCT TCCTCGCTAA AAGTAGCTAG 960

AGCTCCGTTT TGAATACCAT TACAGCTAAC GAAAGCTTTA GAAAATTGGA GATTAGAGAG 1020

ATTTTGTAGG GTCAATGTGC CAACAAAAGC ACCTGTAATA TCGCGATAAT 1 TCCACCTAT 1080

CAAAATCAAA TCTGTTAATT TTCGTTCGCT TAAAATCAGA AAAACAGGTA GACTGTTGGT 1140

TACGACGCGG ATATTGTCAA TAGGCAACTC ACGCGCAAAA AACTCTAATG TTGTTCCTGG 1200

TCCAATGAAA ATAGTTTCTC TTTCTTCTAC TAGACTGCCT GCAAAATGGG CTATTTCTTG 1260

TTTTTCTGCC GTTTGGAGGG CTTGTTTTTC AATATTTGAT CGCTCATTAG TCAAAAGGGA 1320

GTTGGTTCGA AGTTTTTCAG CTCCACCATG CACACGAATC AGCAAATCTT TATCAGCTAA 1380

TTCCTGTAAA TAGCGCCTTG CAGTCATATC TGAAACGGCT ATTTCGTCCA TAATCTGTTT 1440

AACTGTTATG GTTCCTTTAC TATTTACTAT CTCTAAAATT TTGGCTAATT TTTCTTGTTT 1500

GAGCATATTA TCACCTCGTT TCCTACTACT ATCTTACCAT AAACAAACTC ATCATTCAAA 1560

TACAAAAACA ACAAAATGAA ACAAAAACAA AAATATCGAA GTTTGTTTTC AAAACTTTCG 1620

ATATTTTTGT TGGGTTATAA CTTTGATGTT TCTAGTTTAC TTTTTGATGA TTGAGAGTGA 1680

TGGAGAATTA GTCTAAACCG TAGTTATAGT CATCGTCTTG CATGGCTTCA ACTTCGCCAA 1740

GAAGGTAACC ATTTCCGACT TGAGAGAAGA AGTCATGGTT GGAAGTTCCT GTTGAAATAC 1800

CGTTCATAAC GATTGGGTTG ACATCTTCAG CTGAATCTGG GAAAAGTGGA TCTTGTCCCA 1860

TGTTCATGAG AGCTTTATTG GCATTGTAGC GAAGGAAGGT TTTAACCTCT TCAGTCCAAC 1920

CAACACCGTC ATAAAGACTC TCTGTGTAGC CTTCTTCATT TTCATAAAGA GTATAGAGTA 1980

GGTCGTACAT CCATTCTTTG AGTTTTTCTT GCTCTTCTTC AGGCAATTCA TTGAAACCAA 2040

GTTGGAATTT GTAACCAATG TAGGTTCCGT GAACAGACTC GTCACGAATA ATCAATTTAA 2100

TGATTTCTGC AACGTTGGCT AGTTTGTTGT TACCGAGATA GTAGAGGGGA GTGAAGAAAC 2160

CAGAGTAGAA GAGGAAGGTT TCGAGGAAGA CGCTGGCAAC TTTCTTTTCA AGTGGGCTGC 2220

CGTTTAGGTA GATTTCGTTG ACAATCTCAG CCTTCTTTTG TAGGTAAGGA TTGGTATTGG 2280

TCCATTCGAA AATTTCTTCA ATCTCAGCCT TAGTATTCAA GGTAGAAAAG ATTGATGAGT 2340

AAGATTTAGC GTGGACAGAT TCCATAAATT GGATGTTATT GAAAACAGCT TCCTCATGTG 2400

GTGTACGGAT GTCTGCGCGA AGGGCTTGAA CCCCAGTTTC AGATTGCATA GTGTCAAGAA 2460

GGGTTAAACC ACCAAAAACT TTTCCGACCA AGTCTTTCTC TTTGTTAGAT AGCTTTCTCC 2520

AGTCATCCAA GTCGTTTGAT AAGGGAATAC GTGTATCGAG CCAAAATTGC TCCGTCAGTT 2580

TTTCCCAAGT TGATTTGTCG ATGACATCTT CGATGGCATT CCAGTTAATG GCTTTGTAGT 2640

AAGTTTCCAT TTAAAATCTC TTTCTGTGTT TAGTATTGCG AACTCACAAT TATTTCTACT 2700

TTACCATAAT TCTATAGGAG TATCGCACAA AAAGTCGGAA GCCCGACTTT TAAAATGTTA 2760

CATAAATTAT GTTATGACAT AGTAGATTTG ATTTTATCAG TGCTGCTTAG GGAAAAATAA 2820

TGTTTCTATG CTAGAAACTA AATCACACAG CTTTCACATT GGTTGGCGCC GACTTCTCCA 2880

CCGTCATCTG TAAAGGTACG GACGTAGTAG ATAGACTTGA TTCCCTTGTT AAAGGCATAG 2940

TTACGAAGGA TGGACAAGTC ACGTGTCGTT TGTTTATTTT CCCTCTTCCA TTCGTAAAGG 3000

CCTTTTGGAA TGTCACTACG CATGAAGAGG GTGAGTGAAA GTCCTTGATC CACGTGTTCA 3060

GTCGCAGCAG CGTAAACATC GATGACTTTA CGCATATCCA TATCGTAGGC AGAAGTGTAG 3120

TAAGGAATGG TTTCTGTAGA CAAGCCAGCA GCAGGGTAGT AGATTTTACC AATTTTCTTC 3180

TCTTGGCGT 3189
(2) INFORMATION FOR SEQ ID NO: 43:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3580 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:

TTATTGAAGA AGGTGTTAAA GTTGTCACAA CAGGAGCAGG AAATCCAAGC AAGTATATGG 60

AACGTTTCCA TGAAGCTGGG ATAATCGTTA TTCCTGTCGT TCCTAGTGTC GCTTTAGCTA 120

AACGCATGGA AAAAATCGGT GCAGACGCTG TTATTGCAGA AGGAATGGAA GCTGGGGGGC 180

ATATCGGTAA ATTAACAACC ATGACCTTGG TGCGACAGGT AGCCACAGCT GTATCTATTC 240

CTGTTATTGC TGCAGGAGGA ATTGCGGATG GTGAAGGTGC TGCGGCTGGC TTTATGCTAG 300

GTGCAGAGGC TGTACAGGTG GGGACACGGT TTGTAGTTGC AAAAGAGTCG AATGCCCATC 360

CAAACTACAA GGAGAAAATT TTAAAAGCAA GGGATATTGA CACTACGATT TCAGCTCAGC 420

ACTTTGGTCA TGCTGTTCGT GCTATTAAAA ATCAGTTGAC TAGAGATTTT GAACTGGCTG 480

AAAAAGATGC CTTTAAGCAG GAAGATCCTG ATTTAGAAAT CTTTGAACAA ATGGGAGCAG 540

GTGCCCTAGC CAAAGCAGTT GTTCACGGTG ATGTGGAGGG TGGCTCTGTC ATGGCAGGTC 600

AAATCGCAGG GCTTGTTTCT AAAGAAGAAA CAGCTGAAGA AATCCTAAAA GATTTGTATT 660

ACGGAGCCGC TAAGAAAATT CAAGAAGAAG CCTCTCGCTG GACAGGAGTT GTAAGAAATG 720

ACTAAAACAG CCTTTTTATT TGCTGGTCAA GGTGCCCAGT ATCTAGGGAT GGGACGGGAT 780

TTCTATGATC AGTATCCGAT TGTTAAAGAA ACGATTGATC GAGCGAGTCA GGTGCTAGGT 840

TATGATTTGC GTTATCTCAT CGATACGGAA GAAGACAAAC TCAATCAGAC CCGCTATACG 900

CAACCAGCCA TTCTAGCGAC TTCGGTTGCT ATCTACCGTT TATTGCAAGA AAAGGGCTAT 960

CAGCCTGATA TGGTTGCTGG TTTGTCTCTT GGAGAATACT CTGCCTTGGT GGCAAGCGGC 1020

GCCTTGGATT TTGAAGATGC GGTTGCCTTG GTAGCTAAGC GTGGAGCCTA TATGGAAGAA 1080

GCGGCTCCTG CTGACTCTGG CAAGATGGTA GCAGTTCTCA ATACGCCAGT AGAGGTCATT 1140

GAAGAAGCCT GTCAAAAAGC TTCTGAACTT GGAGTGGTTA CTCCAGCCAA CTATAACACA 1200

CCTGCACAAA TCGTCATTGC TGGAGAAGTG GTTGCAGTTG ATCGAGCGGT TGAACTTTTG 1260

CAAGAAGCAG GTGCCAAACG CTTGATTCCT CTTAAGGTGT CAGGTCCCTT TCACACCTCT 1320

CTCCTTGAAC CTGCTAGCCA GAAACTAGCT GAAACTCTGG CTCAGGTAAG TTTTTCAGAT 1380

TTTACTTGTC CCCTAGTCGG CAATACAGAA GCTGCTGTGA TGCAAAAAGA GGACATTGCT 1440

CAGCTCTTGA CGCGTCAGGT CAAGGAACCC GTTCGTTTCT ATGAAAGTAT TGGGGTCATG 1500

CAAGAAGCAG GCATAAGCAA CTTTATCGAG ATTGGACCGG GGAAAGTCTT GTCAGGTTTT 1560

GTTAAAAAAA TTGATCAAAC TGCTCACTTA GCTCATGTGG AAGATCAAGC GAGTTTAGTA 1620

GCACTTTTAG AAAAATAGAC TAAAATAAGT AGAAGTTTTG AAAGGAAAAA AATGAAACTA 1680

GAACATAAAA ATATCTTTAT TACAGGTTCG AGTCGTGGAA TTGGTCTTGC CATCGCCCAC 1740

AAGTTTGCTC AAGCAGGAGC CAACATTGTC TTAAACAGTC GTGGGGCAAT CTCAGAAGAA 1800

TTGCTCGCTG AGTTTTCAAA CTATGGTATC AAGGTGGTTC CCATTTCAGG AGATGTATCA 1860

GATTTTGCAG ACGCTAAGCG TATGATTGAT CAAGCTATTG CAGAACTGGG TTCAGTAGAT 1920

GTTTTGGTCA ACAATGCAGG GATTACCCAA GATACTCTTA TGCTCAAGAT GACAGAAGCA 1980





GATTTTGAAA AAGTGCTCAA GGTCAATCTG ACTGGTGCCT TTAATATGAC ACAATCAGTC 2040

TTGAAACCGA TGATGAAAGC CAGAGAAGGT GCTATCATTA ATATGTCTAG TGTTGTTGGT 2100

TTGATGGGGA ATATTGGTCA AGCTAACTAT GCTGCTTCTA AGGCTGGCTT GATTGGCTTT 2160

ACCAAGTCTG TGGCACGCGA GGTCGCTAGT CGGAATATAC GAGTCAATGT GATTGCTCCA 2220

GGAATGATTG AGTCTGATAT GACAGCTATC TTATCAGATA AGATTAAGGA AGCTACACTA 2280

GCTCAGATTC CGATGAAAGA ATTTGGGCAG GCAGAGCAGG TTGCAGATTT GACAGTATTT 2340

TTAGCAGGCC AAGATTATCT AACTGGTCAA GTGATTGCCA TTGATGGTGG CTTAAGTATG 2400

TAGCGAAAGC TAGAGGTGAA AAGAATGAAA CTAAATCGAG TAGTGGTAAC AGGTTATGGA 2460

GTAACATCTC CAATCGGAAA TACACCAGAA GAATTTTGGA ATAGTTTAGC AACTGGGAAA 2520

ATCGGCATTG GTGGCATTAC AAAATTTGAT CATAGTGACT TTGATGTGCA TAATGCGGCA 2580

GAAATCCAAG ATTTTCCGTT CGATAAATAC TTTGTAAAAA AAGATACCAA CCGTTTTGAT 2640

AACTATTCTT TATATGCCTT GTATGCAGCC CAAGAGGCTG TAAACCAGCC AATCTTGATG 2700

TAGAGGCTCT TAATAGGGAT CGTTTTGGTG TTATCGTTGC ATCTGGTATT GGTGGAATCA 2760

AGGAAATTGA AGATCAGGTA CTTCGCCTTC ATGAAAAAGG ACCCAAACGT GTCAAACCAA 2820

TGACTCTTCC AAAAGCTTTA CCAAATATGG CTTCTGGGAA TGTAGCCATG CGTTTTGGTG 2880

CAAACGGTGT TTGTAAATCT ATCAATACTG CCTGCTCTTC ATCAAATGAT GCGATTGGGG 2940

ATGCCTTCCG CTCCATTAAG TTTGGTTTCC AAGATGTGAT GTTGGTGGGA GGAACAGAAG 3000

CTTCTATCAC ACCTTTTGCC ATCGCTGGTT TCCAAGCCTT AACAGCTCTC TCTACTACAG 3060

AGGATCCAAC TCGTGCTTCG ATCCCATTTG ATAAGGATCG CAATGGGTTT GTTATGGGTG 3120

AAGGTTCAGG GATGTTGGTT CTAGAAAGTC TTGAACACGC TGAAAAACGT GGAGCTACTA 3180

TCCTGGCTGA AGTGGTTGGT TACGGAAATA CTTGTGATGC CTACCACATG ACTTCTCCAC 3240

ATCCAGAAGG TCAGGGAGCT ATCAAGGCCA TCAAACTAGC CTTGGAAGAA GCTGAGATTT 3300

CTCCAGAGCA AGTAGCTATG TTAATGCTCA CGGAACGTCA ACTCCTGCCA ATGAAAAAGG 3360

AGAAAGTGGT GCTATCGTAG CTGTTCTTGG TAAGGAAGTA CCTGTATCAT CAACCAAGTC 3420

TTTTACAGGA CATTTGCTGG GGGCTGCGGG TGCAGTAGAG CTATCGCACC ATCGAGCTAT 3480

GCGTCATACT TTGTACCATG CCAGCTGGGC AAGTGAGGTA TCAGATATAT CGAGCTAATG 3540

TCGTTATGGC AGGTTTGAGA AGAATTCATA CGTATTCAAA 3580

(2) INFORMATION FOR SEQ ID NO: 44:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1780 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

ATGTCCGCAA GAATGTGATT AATCAGCAAT CATCCTTGAT CGGAGATGAA TCGATCTTGG 60

CTTTTGGAGT GAACCAGCCT TTTAGCGGAT TTGGTGTTAA AGGAGAAAGA CAGCAACAGC 120

ATCAGCCTAT GACCTCTATG TACTGAAGCG ACCACTTCCC CAAGTAGGAC CTCGATGTCA 180

TTTTAGATAG TCAAAATCAG GCTGTCTGCA TTGTCGAAAT TACAAAGGTT TCTGTTGAAC 240

TCTTCAATCA AGTTTCTGCG CAACATGCCT TTAAGGAAGG TGAGGGAGAC AAATCACTTG 300

CCTATTGGCG CCAGGTTCAT GAGGACTTTT TCACAGACTG TTTGGGTGAA GTAGGGCTGA 360

CTTTTACACC TGAAAGCAAG GTTGTTTTAG AAGAATTTCG CAAGGTCTAC CCACTGTAGA 420

CTATTAGAAG GAAGAAAGTT TTGGAAATCG CTGTCCAATC CTTTTTTCTC AAGCAAAATA 480

TGATATAATA AGTTTGTTTG AAGAAGAGCA GCAGCTCTTA AACTTAGAAT AGGAGAAAAC 540

TATGCAAGCA GTTGAACATT TTATTAAGCA ATTTGTTCCT GAACATTATG ATTTATTTTT 600

AGATTTGAGT CGTGAGACCA AGACTTTTTC TGGGAAAGTG ACCATCACTG GTCAAGCACA 660

GAGTGACCGC ATCTCCCTCC ACCAAAAAGA CTTGGAAATC ACCTCTGTAG AAGTTGCAGG 720

TCAAGCTCGT CCATTTACAG TTGACCATGA CAATGAAGCC CTTCATATCG AATTGGCTGA 780

GGCTGGTCAA GTTGAATTGG TTCTTGCCTT TTCTGGTAAA ATTACAGACA ACATGACAGG 840

GATTTACCCT TCTTATTATA CAGTTGATGG AGTCAAGAAG GAGGTCTTGT CTACTCAGTT 900

CGAGAGCCAT TTTGCGCGCG AAGCTTTCCC ATGTGTGGAT GAGCCTGAAG CCAAAGCAAC 960

TTTTGACCTC TCTCTTCGCT TTGACCAAGC AGAAGGTGAA TTGGCCTTGT CAAACATGCC 1020

AGAAATCGAT GTTGAAAACC GTAAGGAAAC AGGTATCTGG AXGTTTGAGA CAACACCTCG 1080

CATGTCTTCT TACTTGTTGG CCTTTGTTGC TGGTGATTTG CAAGGGGTGA CCGCTAAAAC 1140

TAAAAATGGT ACCCTGGTAG GTGTCTACTC AACCAAAGCA CATCCACTTT CAAATCTTGA 1200

TTTCTCACTG GATATCGCTG TTCGCTCTAT CGAGTTTTAC GAAGATTACT ATGGAGTTAA 1260

GTACCCAATT CCTCAATCTC TCCACATCGC CCTTCCTGAC TTCTCAGCTG GTGCTATGGA 1320

AAACTGGGGT CTTGTGACCT ACCGTGAAGT TTACTTGGTT GTCGATGAGA ACTCTACATT 1380

TGCTAGCCGT CAACAAGTTG CCCTTGTTGT GGCCCATGAA TTGGCTCACC AATGGTTTGG 1440

GAACCTCGTG ACTATGAAAT GGTGGGATGA CCTTTGGCTC AATGAAAGTT TCGCTAATAT 1500

GATGGAATAC GTCTGTGTGG ATACCATCGA ACCAAGCTGG AATATCTTTG AAGATTTCCA 1560

AACAGGTGGA GTACCTCTTG CTCTTGAACG TGACGCTACT GATGGCGTTC AGTCTGTCCA 1620

CGTCGAAGTT AAACATCCAG ATGAGATCAA TACACTCTTT GACGGCGCTA TCGTCTATGC 1680

AAGGAAGCGT CTCATGCACA TGCTTCGCGT TGCTAGAGAT GCTGATTTGT AAGGTTGCAC 1740

GCCTACTTTG GAAACACCAT ACAGCACACC ATTGGAGTGA 1780
(2) INFORMATION FOR SEQ ID NO: 45:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 671 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:

GTCTTTTGTA GCGAGGCCAG TGTCTTTTGC CCATCATTTG TCAGGCAGAT AAAACTAGAG 60

CGTCTATCTT GATGGCAACA CATGCGACTG AGTAGACCGC AATTTTTAGC TTCCAAGCGA 120

GCCACCATCC TAGAAACTGC GCTCGGGCTC AGATGAAGCT TATCTGGCAG GTCAATCTGG 180

CGTAGAGATT TTTCTTCAGC CAAGTCCAGA TAGTAGAGCA GGTAGAACTC TTTCAAGGTC 240

AGACTTTGCT CGCTCTGTTG GGCAATGGTC TCTTCCAAGA GACTTTCAAT TTCTTTCTGA 300

CGCCGATTGA AGTCAAACCA TTTTTCCAAA TAGGTCATAG TGTCTCCTTT CTTTTTAGAG 360

TCATAAAATA GAAGAAAGTC CATTAACGGG CAGTCTCTGC GTCACAAGAT GATTGCGCAT 420

GCAATAATTA TACTACTTTT CAAGAATGCT GGCAAGCTCT GTTTTTTAGT GGTTTTCTTT 480

TTTACTGTCT ATATTTTTGG TAAAAATCAA CTTTTACTTG GATGAAGGTT TTGGCTTCAC 540

GTAGGAGTTG AAGAAGGGTG GCGCGGGTTT CAATTCTTCT CTTGTCTTGG GCAGACTGCG 600

GTTCCGGAAG ACTTCCAGAT AACGTTCAAT TTCATCTAGC AATCAGAGCA GGATTGGTCT 660

GGCTCAGTGA C 671
(2) INFORMATION FOR SEQ ID NO: 46:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1557 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:

TTTCAGCTCA CAAATATAGG TCGGATGAGC CACTTCCTTA CGAACACGCG CATCAAAAGC 60

ATCTAGCTCC TCACGTGAAA AAGCATCCTG CAAACTATAA AGAGGATACT GATGACTGTA 120

TTTTTCAAAA CCATCTAAAA CCTTGCCACC AACACGATGA GTCGGACTGT CTGCTAGCAC 180

TTGCTCTGGA TAAGCAGTTT CTAACTCGAC CAACTCACGG TAAAGGCGGT CATACTCACT 240

GTCTGAAACC GAGGGATTAT CGCTGGTATA GTACTCAGTC GCATAGCGAT TGAGCAAAGC 300

GACTAACTCA TTCATTCTTT TATTCATAAG ACCATTTTAC CATAAAACAA GCCCTCCTCA 360

CAAACGAGAA GGGCGGAAAA AACACTTAGT TTGAAATTAT TTTTGAAACT CAAGCAACCT 420

TATATCAATT TTTCAAAATG AGTTCGAACA TAAATAAACG ATATACAAGA CAAGATGATA 480

ACACCACTTC CAATTATCAG GAAAGAAGAG AGATGTACAC TTGGCAAGAC TGTCATAAAT 540

CCTTTTGCAA TAGGCATAAA TAGAATAGCT AAGGTAAAAA TTGTACTCAG TACTCTTCCA 600

AGAAATTCGC TCTCAACCTT GGTTTGTACT TGAGTAAAAA AGTGAATATT AAAAATCGTC 660

ATAAACAATT CACAAACTAA ATTTCCAGAA AAGGAAAGAA AAGTTGGAAG TGGTAATCCC 720

ATCATAAAAA CTCCGACACC TGTCAAAGCC AGTAAAATCA AAAGATTATA AATATTAGCT 780

TTAATTTTAC TAGCTAGAAG AGCCCCAATG ATGGAACCAA TAGCCCCCAT AGTTAAAATA 840

CTTGCATAGG CTCCTTCTGA CCCGTAAAGC TGATTCGAAA AGGGAAGTAG AAATTCAAAA 900

GCTGCAAAAA AGAAATTAAC GCTGGAAGCT ACCAGCAAAA GGAAGAAAAT TTCTTGCTGA 960

TGCCAGATAT AGTGTAACCC ATCCTTGATA TCTACAAAAA TATCTCTCCC AGTAAAAGCC 1020

TTTTTCTCTT GAACTTTTGC TTCCTCTTTT GGAAGGAAAG CCACTAGAAC AAAAGCAATG 1080

AAAAAAGTCA GCGAGTCTAG CAGTAGCGTC ATATGGAGAC TTGCAAACTG TAAAACAAGG 1140

AAGGAAAGAA CAGGAGAGCT AACACCTACA ACCTGCAAAA CCAGCTCTAA GCGAGAATTA 1200

TAGATCACAA TCTCATTTTT CTCCACCACT TCAGTTATGA TAGCTTTATT GGCTGTGCGA 1260

GAAAAGGCAA AAGCAATAGC CTGCACAATG TTAGCAACAA TCAAAGCGCC AATCATCCAG 1320

CTATCATTCC TTATGAAAGA AATAGCCAGA CAAAGAATCC CACAAACAAG ATCTGCCGTC 1380

ATTAAAATCT TACGACGAGA AAAACGGTCT GAAATAACTC CGCCAAAGGG ATTGACGAGA 1440



ATAGATGTGA CGAGCTCAGA AATCTGATAC ATTCCTAAAA CTGTCTGTCC TATAGTCCCC 1500

ATAGAAGCCA ACCAGACACT ATTTCCATAA TCATAGAGCA TATTCCCATT TTATTGA 1557

(2) INFORMATION FOR SEQ ID NO: 47:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 658 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:

CTTATTTGGT TTGGGAATTC GTCATGTCGG AAGCAAGGCT AGTCAGCTTT TACTTCAATA 60

TTTCCATTCA ATTGAAAATC TGTATCAGGC AGATTCAGAG GAAGTGGCTA GTATTGAAAG 120

TCTAGGTGGC GTGATTGCCA AAAGTCTTCA GACTTATTTT GCGGCAGAAG GCTCTGAAAT 180

TCTGCTCAGA GAATTGAAAG AAACTGGGGT CAATCTGGAC TATAAAGGAC AGACGGTAGT 240

AGCGGATGCG GCCTTGTCAG GTTTGACCGT GGTATTGACA GGAAAATTGG AACGACTCAA 300

GCGCTCAGAA GCTAAAAGTA AACTCGAAAG TCTGGGTGCC AAAGTGACAG GTAGTGTTTC 360

TAAAAAGACC GACCTCGTCG TGGTAGGTGC AGACGCTGGA AGTAAACTGC AAAAAGCACA 420

AGAACTTGGT ATCCAGGTCA GAGATGAGGC ATGGCTAGAA AGTTTGTAAT GGATCGTTTA 480

AAAACAGAGT TTAGAGAATA TGACTATGTC TGTTAATTGA GACGAGATTG ACAAAAATTT 540

ATTAGTGAAA TAGGAAACAA AGTAAAAAGG AAAAATAAAA AATGTATACT ACCCTATGCG 600

CATTCATTAC CATCGTAAGA ATGGAGAATA TGACCTTGCT CCTTTGTAAA AGTCAGGA 658
(2) INFORMATION FOR SEQ ID NO: 48:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2474 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:

ACAATCGATC AGACAGTCAA TCGATTTCTA AAATGTTTAG AGTAGAGATG TACCTATTCT 60

AGTTCAATAT ACTATATAAC TGAAAATTTA GATAAATTAG TTTTGGAAAT GACTAACCAA 120

AGATATCCAA AGTAGTCTAA AATTGTCTAT ACTTTATGAG TGTTTTAGTT AGGAAAAAGG 180

CTTGTTGTCT ATAATTGGCG CATTAGTCTA GATTTTATTT ATAGAAAATG TTATAATAGA 240

CTGTATTTAA AAAATTTTAA GGAGAAATGA CAGAATGTCT GTATCATTTG AAAACAAAGA 300

AACAAACCGT GGTGTCTTGA CTTTCACTAT CTCTCAAGAC CAAATCAAAC CAGAATTGGA 360

CCGTGTCTTC AAGTCAGTGA AGAAATCTCT TAATGTTCCA GGTTTCCGTA AAGGTCACCT 420

TCCACGCCCT ATCTTCGACC AAAAATTTGG TGAAGAAGCT CTTTATCAAG ATGCAATGAA 480

CGCACTTTTG CCAAACGCTT ATGAAGCAGC TGTAAAAGAA GCTGGTCTTG AAGTGGTTGC 540

CCAACCAAAA ATTGACGTAA CTTCAATGGA AAAAGGTCAA GACTGGGTTA TCACTGCTGA 600

AGTCGTTACA AAACCTGAAG TAAAATTGGG TGACTACAAA AACCTTGAAG TATCAGTTGA 660

TGTAGAAAAA GAAGTAACTG ACGCTGATGT CGAAGAGCGT ATCGAACGCG AACGCAACAA 720

CCTGGCTGAA TTGGTTATCA AGGAAGCTGC TGCTGAAAAC GGCGACACTG TTGTGATCGA 780

CTTCGTTGGT TCTATCGACG GTGTTGAATT TGACGGTGGA AAAGGTGAAA ACTTCTCACT 840

TGGACTTGGT TCAGGTCAAT TCATCCCTGG TTTCGAAGAC CAATTGGTAG GTCACTCAGC 900

TGGCGAAACC GTTGATGTTA TCGTAACATT CCCAGAAGAC TACCAAGCAG AAGACCTTGC 960

AGGTAAAGAA GCTAAATTCG TGACAACTAT CCACGAAGTA AAAGCTAAAG AAGTTCCAGC 1020

TCTTGACGAT GAACTTGCAA AAGACATTGA TGAAGAAGTT GAAACACTTG CTGACTTGAA 1080

AGAAAAATAC CGCAAAGAAT TGGCTGCTGC TAAAGAAGAA ACTTACAAAG ATGCAGTTGA 1140

AGGTGCAGCA ATTGATACAG CTGTAGAAAA CGCTGAAATC GTAGAACTTC CAGAAGAAAT 1200

GATCCATGAA GAAGTTCACC GTTCAGTAAA TGAATTCCTT GGGAACTTGC AACGTCAAGG 1260

GATCAACCCT GACATGTACT TCCAAATCAC TGGAACTACT CAAGAAGACC TTCACAACCA 1320

ATACCAAGCA GAAGCTGAGT CACGTACTAA GACTAACCTT GTTATCGAAG CAGTTGCCAA 1380

AGCTGAAGGA TTTGATGCTT CAGAAGAAGA AATACAAAAA GAAGTTGAGC AATTGGCAGC 1440

AGACTACAAC ATGGAAGTTG CACAAGTTCA AAACTTGCTT TCAGCTGACA TGTTGAAACA 1500

TGATATCACT ATCAAAAAAG CTGTTGAATT GATCACAAGC ACAGCAACAG TAAAATAATC 1560

TTAATAAACA GAAAACCCAC CTGAATTGGT GGGTTTTCTG ATGCACTATT TTCCAAAAAT 1620

CTCTTTGAGG TCTGTGTCTG TAATCCGAAT CATGGCTGGG ATGCGGTCCC AGTTTTCTTC 1680

GGTTAGGATG TAGGATTGTT CAGAGGCACT TGATGTGACT GTTTCAGAGA CAGCTTGTTG 1740



CTTTTCTTCA ACATTCTCCA GTAGATCACT GAAGCGTTCA ATCAGATAGG TTTTTCGGGC 1800

AGTTCCGATG TGTTGGGTAG CATAGTCGAA GGCTTGTAAT TCGCCTAGTA AGATGAGTTT 1860

GCTTTTGGCA CGTGTAATGG CTGTGTAGAT GAGATTTCGC TCCAGCATAC GTCGGCTAGC 1920

ACTAGTAATC GGTAGGATGA CAACTGGGAA CTCACTTCCC TGAGACTTAT GAATACTCAT 1980

GGCATAGGCC AAGCGAATCT TGTACCATTC GTTACGGGGG TAAGAGACTT CATTACTATC 2040

AAAATCAATG ACAATCTCGT CTTGTTTCGA TTCGGTGTAT TTACCAGGAA TCAGGTCTGT 2100

GATAGCTCCT AAATCCCCAT TAAAGACATT GATTTCAGCA TCGTTAACCA AATGAATGAC 2160

CTTGTCTCTC TTACGATAGT GACACTGAGG AGCTTCAAAA CTGAGTTGAT CTTTTTGTGG 2220

GGGATTGAGC AGGTCTTGCA TGAGCTGATT GATAGCATCA ATCCCTGCCG TCCCTCGGTA 2280

CATAGGAGCC AGAACTTGGA TATCACGGGC GGGAATACCA TTTCTGAGGG CGGCACCTAA 2340

GATTTTTTCA ATGGTGGCAG GAATATGGCC ACTAGCAATT TCAAAGTAGG AACGGTCAGC 2400

TTTTTTTTGG GTGAAATCAG CTGGCAAGAT GCCCTGTCGA ATCTGACTAG CTAGGGTGAC 2460

GATGGTTGAT TCTT 2474

(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 716 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 

ATCAAAATTA ACGTATTCTT TTTGAAGTTC AAGAACTTCT TCCATTGTTG AGCATTCTGT 60 

AAGGGCACGG TTTGCGTACT CTTCCATCTT AGCTGTGTCG AGTTTCTTCA TCAAGCTGCG 120 

TGTACGAAGT ACAGATGTTG CTGACATAGA GAACTCATCC AAGCCCATTC CGACAAGAAG 180 

TGGAACAGCT TGTTGGTCAC CAGCCATCTC ACCACACATA CCAGCCCATT TACCTTCAGC 240 

GTGAGCTGCT TTGATCACAT TGTTAATCAA GCGTAGGATT GATGGGTTGT ATGGTTGGTA 300 

AAGGTATGAA ACTTGTTCGT TCATACGGTC TGCTGCCATT GTATATTGGA TCAAGTCGTT 360 

TGTACCAATT GAGAAGAAGT CAACTTCTTT AGCAAATTGG TCTGCAAGCA TAGCCGCTGC 420 

AGGAATCTCG ATCATGATAC CAACTTGAAT GTTATCCGCA ACTGCAACAC CTTCAGCAAG 480

AAGGTTTGCT TTTTCTTCAT CAAAGACTGC TTTCGCTGCA CGGAATTCTT TCAAGAGCGC 540

AACCATTGGG AACATGATAC GCAATTGACC GTGAACAGAC GCACGAAGAA GAGCACGGAT 600 

TTGTGTGCGG AACATAGCAT CTCCAGTCTC AGAGATAGAG ATACGAAGAG CACGGAATCC 660 

AAGGAATGGG TCATTCGTGA GGCATATCGA AGTAAGGAAG TCCTTATCTC CACCGA 716 
(2) INFORMATION FOR SEQ ID NO: 50:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 962 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:

AGTAACCTAA ATCAATTATG GTGTTATGAG TCTTGGTGTG CCCAAAGTGC TGACGTAACT 60 

30 

ATCTCAGCTG AAGGTGCAGA TGCAGATGGC CTATCGCTGC AATCTCAGAA ACAATGGAAA 120 

AAGAAGGATT GGCATAAGGG AAATGACAGA AATGCTTAAA GGAATCGCAG CATCTGACGG 180 

35 TGTTGCAGTT GCAAAAGCAT ATCTACTCGT TCAGCCGGAT TTGTCATTTG AGACTATTAC 240 

AGTCGAAGAT ACAAACGCAG AAGAAGCTCG CCTTGATGCC GCTCTACAGG CATCACAAGA 300 

CGAGCTTTCT GTTATTCGCG AGAAAGCAGT AGGTACGCTC GGTGAAGAAG CAGCTCAAGT 360 

40 

TTTTGATGCT CACTTAATGG TTCTTGCTGA CCCAGAAATG ATCAGCCAAA TCAAGGAAAC 420 

TATCCGTGCG AAGAAAGTGA ATGCAGAAGC AGGTCTGAAA GAAGTTACAG ATATGTTTAT 480 

45 CACTATCTTT GAAGGCATGG AAGACAACCC ATACATGCAA GAACGCGCAC GGATATCCGC 540 

GACGTGACAA AACGTGTATT GGCAAACCTT CTTGGTAAAA AATTGCCAAA CCCAGCTTCT 600 

^ ATCAATGAAG AAGTGATTGT GATTGCGCAT GACTTGACTC CTTCAGATAC AGCTCAATTG 660 

GACAAAAACT TTGTAAAAGC TTTTGTAACC AACATTGGTG GACGTACAAG CCACTCAGCT 720 

ATCATGGCAC GTACACTTGA AATTGCTGCT GTATTAGGTA CAAACAACAT CACTGAAATC 780 

55 GTTAAAGACG GTGACATCCT TGCTGTTAAC GGGATCACTG GAGAAGTGAT TATCAACCCA 840 

ACAGATGAAC AAGCGGCAGA ATTTAAAGCA GCTGGTGAAG CCTATGCGAA CAAAAAGCTG 900 

AATGGGCACT TTTGAAAGAT GCTCAACAGT GACTGCTGAC GGTAACACTC GAGTTGGCTG 960 

CC 962 



(2) INFORMATION FOR SEQ ID NO: 51:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2702 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: 

GATCGTTTCC GTGGCTTGAT CGGAAGCATG TTTGACGAAT AAAGAGGAAA AATAAATTAT 60 

GACATTTTCA TTTGATACAG CTGCTGCTCA AGGGGCAGTG ATTAAAGTAA TTGGTGTCGG 120 

TGGAGGTGGT GGCAATGCCA TCAACCGTAT GGTCGACGAA GGTGTTACAG GCGTAGAATT 180 

TATCGCAGCA AACACAGATG TACAAGCATT GAGTAGTACA AAAGCTGAGA CTGTTATTCA 240 

GTTGGGACCT AAATTGACTC GTGGTTTGGG TGCAGGAGGT CAACCTGAGG TTGGTCGTAA 300 

AGCCGCTGAA GAAAGCGAAG AAACACTGAC GGAAGCTATT AGTGGTGCCG ATATGGTCTT 360 

CATCACTGCT GGTATGGGAG GAGGCTCTGG AACTGGAGCT GCTCCTGTTA TTGCTCGTAT 420 

CGCCAAAGAT TTAGGTGCGC TTACAGTTGG TGTTGTAACA CGTCCCTTTG GTTTTGAAGG 480 

AAGTAAGCGT GGACAATTTG CTGTAGAAGG AATCAATCAA CTTCGTGAGC ATGTAGACAC 540 

TCTATTGATT ATCTCAAACA ACAATTTGCT TGAAATTGTT GATAAGAAAA CACCGCTTTT 600 

GGAGGCTCTT AGCGAAGCGG ATAACGTTCT TCGTCAAGGT GTTCAAGGGA TTACCGATTT 660 

GATTACCAAT CCAGGATTGA TTAACCTTGA CTTTGCCGAT GTGAAAACGG TAATGGCAAA 720 

CAAAGGGAAT GCTCTTATGG GTATTGGTAT CGGTAGTGGA GAAGAACGTG TGGTAGAAGC 780 

GGCACGTAAG GCAATCTATT CACCACTTCT TGAAACAACT ATTGACGGTG CTGAGGATGT 840 

TATCGTCAAC GTTACTGGTG GTCTTGACTT AACCTTGATT GAGGCAGAAG AGGCTTCACA 900 

AATTGTGAAC CAGGCAGCAG GTCAAGGAGT GAACATCTGG CTCGGTACTT CAATTGATGA 960 

AAGTATGCGT GATGAAATTC GTGTAACAGT TGTCGCAACG GGTGTTCGTC AAGACCGCGT 1020 

AGAAAAGGTT GTGGCTCCAC AAGCTAGATC ACCGCGCCTA GGATAACAAT TTTAGCAATC 1080 

AAGATAAACC AAAACATCAT AACAACAAGA AGAACGGAAC CTAAAATTCG GACATCCACC 1140 

AAATGATGGA CATAGTAATT GAGATAACTA GAGAACAGAG TTAGTACACC TAAAATCACC 1200 

AAGAGAACAA AGGCACTGCC TGGTAGGGTA TAGCTAATTT TCCTGTTAGA TAGATTGGGA 1260

AGAAAATAAT AAAGCATGAC CAAGATAGCA AAGAGGAGGG CGTAAATCAG AGGACCTGCC 1320

AACCCTTGTA AAGCCTGATA GATAATGCCA TCTTTTGTCC AATAATGAGC AAGTAAAGCC 1380 

AAAATCATCT GACCAAATAA GATCAAAAAC AAGGCAAACG CAAAGAGGAA CTGCAAGCCA 1440 

AAACTGACTA GGAGACTTAG CATCTGATGG GAAATAAGTC CACGACTCTT TTCGACGCCA 1500 

TAAGCCTTGT TAAAAGCTTT TTGCAAGAAA TTTATAGATT TTGAAAAACT CCATAACGCC 1560

GATAAAACAG AAAAACTCAA TAAACCTGTT GAAGGTTGCG TCAAAGACTT CTCTGGCTAT 1620 

TTTTTCCACA CCTTCATAGA GGCTTGGGGG CAGGACGTCT TTCATAAAGC CCAGAAATTC 1680 

TCCCACAGGA ATCTGAAAAT AGGGGAGGAT ATTGACCACC ACCAAAAGCA GGGGGAAAAT 1740 

CGAAATCAAC CAATAGTACG CTACTGCGAC ACTGGTCAAA CTCACTATCT GATGCTTGAT 1800 

AATAATGCAA AAAAGCTTTT AATAAAGGCT TGTCTATCAG CTCTTTCCAC CACTTTTTCA 1860

TGTCATACTC CTTCATTTAT AATCTTATAC TCAATGAAAA TCAAAGAGCA AACTAGAAAG 1920 

CTAGCCGCAA GCTGCTCAAA ACACTGTTTT GAGGTTGTAG ATAAGACTGA CGAAGTCAGT 1980 

CACATACATA CGGTAAGGCG ACGCTGACGT GGTTTGAAGA GATTTTCGAA GAGTATTAAC 2040 

TAATTTCTTC TTACCAATTC CACCATATCA TACGGTAGGG TATTGGCAGC TTCCTTCAAG 2100 

GAATAGTTCT CTAAGTTATT TACATTTTGT CGTAATTTCT TGGCATACTT AGTTGTAATT 2160

AATCGTTTTT CTTCGTATTC GAAAATCAAC TTGCGCTCCA GATAATAGCC TCTCAGCATT 2220 

TCATTGATAT TGTTGGGTTT GACACGATTG ATAACCCGTT CGACAAAGGC ACCACTGCTG 2280 

ATAATAGTTG TTTCTCGAAG ACGAGACTCC TGCATAAAAC TAATCAAAGA GCGTCTGTAG 2340 

ACTCCCTTCA GGTTTTCCAA ACTTTCAATA ATCATCTCCG TATTGGCAAG ATAGAGCTCT 2400 

GCAATTTGGT CATAATCAAG AGCACGGAGA CGGCTTTGCT CCTTGTCCTT CCAGCTACGG 2460

AAGGTCTTTC CAAGAGTAAA AACTTCATGA AGGAGAAAAC GTAAAATCCT CAAGGAAACA 2520 

AGAAAATAAT AGGTCAGTCT TGAGGCAAGT TTACGATTGA TTCCTTGTTC TATATTTTTC 2580 

AGATAACGTT GGTAAACTCG GTAAGCACGA TTGCTAATGT TCCCCTCTTC ATAGGCCTGT 2640 

TCCAAACCAT CACTTTCAAT ACTAAGAATC AAGAGTTTCA AAGCAGCCCA GTCTTCTTGA 2700 

TC 2702

(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 6217 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic)




(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 



GAATTCCAAG AAGCTAGCCA AGAAAGTCGT GAACGTAGTG ATCCGCTAAA TAGTTATCTC 60

CTTTTGTCAG GCTCCTTGAC GAAAGAAAAG CTTGCCGATA AATTAGGAGA TTTGGGTTAT 120

AAAGCAAGTG CTGACCGAAA GATACCGCCC TATTTTCTTG CTTTTCGAAT ATTACTAAAT 180

CCCCTTATTT TAATTAGTTT AGCAATATTT GGCTTATCTT TCTTTGCTTT AGTGATTATC 240

ACTCGGATTA AGGAAATGAG AGCAGCAGGT ATAAAACTCT TTTCTGGTCA GACTCTCTTA 300

TCCATCATGG GGCATTCTTT ATCTACTGAT ATCAAATGGC TCCTTCTATC AGCCCTCCTT 360

TCCTTCCTAG GTGGGGGTGT CGTTCTTTTT AGTCAAGGTT TGTTTTATCC TATCTTGTTA 420

GCCACCTATG GTTTTGGGAT TAGTTTCTAT CTGTTGTTTT TATTGGCGAT TTCAATTTTA 480

CTAATGCTTC TTTATCTAAT GAGTTTGAAT ACAAAGCATT AGTTCCCGTT ATTAGGGGGA 540

GATTCCCCCT GAACCCTGAT GAATAACCCA TTGTTTCAGT AGACCTGTTT TTTCAGTAGG 600

ATACGCTTTA AGACAGGTTG ACGTCTTACC AACGATTGAA AGAACTTGAA ATTTCAGACA 660

AGATGGCAGG ATAGAGTAGA CTATTATCAC GATTTCTTTT GACTTAGGTT ATAGAGGTTG 720

AGATTCAGAA AATCAGAGCA AGTGGTATGC CTTTACCAAG GGAGCAGCGA AGAAGAACAA 780

GCTCTTTATG TAAAGGATAA TCTGCTCCAT TTTGCCAATC CACAAGGAAA AAATGAACAG 840

GGAGAGACAC TGGATACCTA TAGTCCAGAT GCTAATACGC TCTATGTTAG TCCCAGTTAT 900

TTGGACAAGG AAAAGGTCGT GGTAGATGCT GAGACCAAAC AGAAGTTAGC CCATCTCCAA 960

AAAGGTGAGT TTATCCTCTT GCTCCCAGAA CATTTGCGCT CTCGAGAAGC AGAACTTAAG 1020

AAAGTTTTTG AAGAAAGATT GAGTTATTAT GGAAAATCTG GTGAGGAGGC AAGTGCTCCT 1080

TTGGATTATG AGATGAAAGC GCACGTTAGT TATCTTTCAA TGGGAGAAAA GCGGTTTGTT 1140

TATAATAACG GTGAGAATCC CGTATCTACT CAGTATTTGA CTGATCCGAT TTTAGTTGTA 1200

TTCACGCCGA CTTCTACAGG TGATAGTTTT ATATCCTTAT CTAGTTGGTC TATCAATGCT 1260

GGAAAACAAC TCTTTATCAA AGGATATGAG AGTGGGCTAG AACTCTTGAA GAAAGCTGGA 1320

ATTTATGAGC AAGTATCCTA TCTTAAAGAA GGAAGAAGTG TTTATCTAAC TCGTTATAAT 1380

GAAGTTCAAA CTGAAACAGC AACTTTAATC TTAGGAGCTA TTGTGGGGAT AGCTAGTTCC 1440

TTGTTACTCT TTTATTCTGT CAATCTTCTA TATTTCGAGC AATTCCGCCG AGATATCTTG 1500

ATTAAACGAA TTTCAGGTTT ACGATTTTTT GAAACACATG CTCAGTATAT GGTTAGTCAA 1560

TTTGCCAGTT TTGTATTTGG TGCTAGTCTC TTTATTTTAA GCAGTCGAGA CTTGGTGATT 1620

GGCTTGCTCA CTTTATTAGT CTTTCTAGCT AGTGCAGTTT TGACGCTTTA CCGTCAAGCG 1680 

CAGAAAGAAT CTCGTGTTTC TATGACAATT ATGAAAGGAA AATAGGATGA TTGAACTAAA 1740 

GAATATATCT AAAAAATTGG AAGCCGTCAG CTATTTTCAG ATACGAATCT TCATTTTGAA 1800 

GGTGGGAAAA TTTATGCCTT AATCGGTACA AGTGGCTGTG GTAAGACAAC ACTCTCGAAT 1860 

ATGATTGGAC GATTGGCGCC ATATGACAAA GGGCAAATCA TCTATGATGG CACTTCTCTT 1920 

AAGGACATCA AGCCTTCTGT TTTCTTTAGA GATTACTTAG GATACTTATT TCAAGATTTT 1980 

GGCTTAATTG AAAGCCAAAC CGTCAAAGAG AATCTCAATC TGGGTTTAGT TGGTAAAAAG 2040 

TTGAAGGAAA AAGAGAAAAT CTCTTTGATG AAACAAGCTC TAAACCGTGT AAACCTCTCT 2100 

TATTTGGATT TGAAGCAACC TATATTTGAG TTATCAGGAG GAGAAGCACA ACGTGTTGCA 2160 

CTAGCGAAGA TAATTTTAAA GGATCCGCCT TTGATTCTTG CAGATGAACC AACCGCTTCC 2220 

TTAGACCCCA AAAATTCTGA GGAATTACTT TCCATCCTAG AATCTTTAAA AAATCCGAAT 2280 

CGGACCATTA TTATTGCGAC CCACAATCCT CTGATTTGGG AGCAAGTGGA TCAGGTCATT 2340 

CGAGTTACCG ATTTATCACA TAGATGATAT GGTAAGATTC AGTTAGAAGA AAGAGTCACA 2400 

AACACACTTT GTGGCTTTTT TATTTCCATA AAAATGGTAA AATAGTAGGA GTAGAAATGA 2460 

GTTCGAGACA TGAAAGTAAT AGATCAATTT AAAAATAAGA AAGTTCTTGT TTTAGGTTTG 2520 

GCCAAGTCTG GTGAATCTGC AGCTCGTTTG TTGGACAAGC TAGGTGCCAT TGTGACAGTA 2580 

AATGATGGGA AACCTTTCGA GGACAATCCA GCTGCCCAAA GTTTGCTGGA AGAAGGGATC 2640 

AAGGTCATTA CAGGTGGCCA TCCTTTGGAA CTCTTGGATG AAGAGTTTGC CCTTATGGTG 2700 

AAAAATCCAG GTATCCCCTA CAACAATCCC ATGATTGAAA AGGCTTTGGC CAAGAGAATT 2760 

CCAGTCTTGA CTGAGGTGGA ATTGGCTTAT TTGATTTCAG AAGCACCGAT TATTGGTATC 2820 

ACAGGATCGA ACGGTAAGAC AACCACAACG ACTATGATTG GGGAAGTTTT GACTGCTGCT 2880 

GGGCAACATG GTCTTTTATC AGGGAATATC GGCTATCCTG CCAGTCAGGT TGCTCAAATA 2940 

GCATCAGATA AGGACACGCT TGTTATGGAA CTTTCTTCTT TCCAACTCAT GGGTGTTCAA 3000 

GAATTCCATC CAGAGATTGC GGTTATTACC AACCTCATGC CAACTCATAT CGACTACCAT 3060 

GGGTCATTTT CTGAATATGT AGCAGCCAAG TGGAATATCC AGAACAAGAT GACAGCAGCT 3120 

GATTTCCTTG TCTTGAACTT TAATCAAGAC TTGGCAAAAG ACTTGACTTC CAAGACAGAA 3180 

GCCACTGTTG TACCATTTTC AACACTTGAA AAGGTTGATG GAGCTTATCT GGAAGATGGT 3240 

CAACTCTACT TCCGTGGTGA AGTAGTCATG GCAGCGAATG AAATCGGTGT TCCAGGTAGC 3300 

CACAATGTGG AAAATGCCCT TGCGACTATT GCTGTAGCCA AGCTTCGTGA TGTGGACAAT 3360 

CAAACCATCA AGGAAACTCT TTCAGCCTTC GGTGGTGTCA AACACCGTCT CCAGTTTGTG 3420

GATGACATCA AGGGTGTTAA ATTCTATAAC GACAGTAAAT CAACTAATAT CTTGGCTACT 3480

CAAAAAGCCT TATCAGGATT TGACAACAGC AAGGTCGTCT TGATTGCAGG TGGTTTGGAC 3540

CGTGGCAATG AGTTTGACGA ATTGGTGCCA GACATTACTG GACTCAAGAA GATGGTCATC 3600

CTGGGTCAAT CTGCAGAACG TGTCAAACGG GCAGCAGACA AGGCTGGTGT CGCTTATGTG 3660

GAGGCGACAG ATATTGCAGA TGCGACCCGC AAGGCCTATG AGCTTGCGAC TCAAGGAGAT 3720

GTGGTTCTTC TTAGTCCTGC CAATGCCAGC TGGGATATGT ATGCTAACTT TGAAGTACGT 3780

GGCGACCTCT TTATCGACAC AGTAGCGGAG TTAAAAGAAT AAAATATGAA AAAAATTGTC 3840

TTTACAGGTG GGGGGACGGT TGGACACGTT ACCCTCAATC TTTTGTTAAT GCCCAAGTTC 3900

ATCGAAGATG GTTGGGAAGT CCACTATATC GGGGACAAGC GTGGTATCGA ACACCAAGAA 3960

ATCCTTAAGT CAGGTTTGGA TGTCACTTTC CACTCCATTG CGACTGGGAA ATTGCGTCGC 4020

TATTTCTCTT GGCAAAATAT GCTGGACGTC TTCAAAGTTG GCTGGGGAAT CGTCCAATCG 4080

CTCTTTATCA TGTTGCGACT TCGTCCACAG ACCCTTTTTT CAAAGGGGGG CTTTGTCTCA 4140

GTACCGCCTG TTATCGCAGC GCGTGTGTCA GGAGTGCCTG TCTTTATTCA CGAATCTGAC 4200

CTGTCTATGG GCTTGGCCAA TAAAATCGCC TATAAATTTG CGACTAAGAT GTATTCAACC 4260

TTTGAGCAAG CTTCGAGTTT GTCTAAGGTT GAGCATGTGG GAGCAGTGAC CAAGGTTTCA 4320

GATCAAAAAA ATCCAGAACC AGATGAATTG GTGGATATTC AAACCCACTT TAATCATAAA 4380

TTGCCGACTG TATTGTTTGT TGGCGGTTCT GCAGGTGCTC GTGTCTTTAA CCAATTGGTG 4440

ACAGACCATA AGAAAGAACT AACAGAGCGC TACAATATTA TCAATCTAAC TGGAGATTCT 4500

AGTCTGAACG AGTTGAGCCA AAATCTTTTT CGTGTTGACT ATGTGACCGA TCTCTATCAA 4560

CCCTTGATGG AATTGGCTGA TATTGTTGTG ACACGAGGTG GTGCCAATAC GATTTTTGAG 4620

CTCTTGGCGA TAGCAAAATT GCATGTCATT GTGCCGCTTG GTCGTGAAGC TAGTCGTGGT 4680

GACCAGATTG AAAATGCAGC TTACTTTGTA AAAAAAGGCT ATGCAGAAGA CCTTCAAGAA 4740

AGCGATTTGA CCTTGGATAG TTTGGAAGAG AAGCTTTCTC ACTTACTAAG TCACAAGGAA 4800

GATTACCAAG CTAAGATGAA AGCTTCTAAG GAATTGAAAT CTCTAGCAGA TTTTTATCAA 4860

TTGTTGAAAA AAGATTTATC ATAAGGAAAG TAAATGTCAA AAGATAAGAA AAATGAGGAC 4920

AAAGAAACCC TCGAAGAATT GAAAGAGTTA TCAGAATGGC AGAAACGAAA CCAAGAATAT 4980

CTAAAAAAGA AGGCTGAAGA AGAGGTGGCT CTAGCTGAGG AGAAGGAAAA GGAAAGACAA 5040

GCTCGAATGG GAGAAGAATC TGAGAAGTCA GAGGACAAAC AGGACCAGGA GAGTGAAACA 5100

GACCAGGAAG ATTCAGAATC AGCTAAGGAA GAGTCTGAAG AAAAAGTAGC ATCCTCAGAG 5160

GCTGACAAAG AGAAAGAAGA ACCAGAGTCT AAAGAGAAGG AGGAACAGGA TAAAAAGCTT 5220

GCTAAAAAGG CTACAAAGGA AAAACCAGCC AAAGCAAAGA TTCCTGGTAT CCATATCTTG 5280

CGAGCCTTCA CGATTTTATT TCCAAGTCTG CTTTTATTGA TTGTCTCTGC CTACTTGCTC 5340

AGTCCTTATG CGACCATGAA AGATATTCGT GTTGAGGGAA CGGTGCAAAC TACAGCTGAT 5400 

GATATTCGAC AGGCTTCAGG CATTCAGGAT TCGGATTATA CGATTAACCT TCTGCTAGAC 5460 

AAGGCAAAAT ATGAAAAGCA GATTAAGTCT AACTATTGGG TTGAATCAGC TCAACTTGTC 5520 

TATCAATTTC CAACTAAGTT CACTATTAAG GTCAAGGAAT ATGATATTGT GGCCTACTAT 5580 

ATTTCTGGTG AAAATCATTA TCCTATTCTT TCCAGTGGTC AGCTTGAGAC TAGTTCTGTG 5640 

AGTCTGAACA GTTTACCAGA AACTTATTTA TCAGTTCTCT TTAATGATAG TGAACAAATC 5700

AAGGTTTTTG TCTCAGAACT TGCTCAAATT AGCCCAGAAC TCAAGGCGGC TATCCAAAAG 5760 

GTGGAATTAG CCCCAAGCAA GGTGACATCC GATTTAATTC GATTGACCAT GAATGATTCG 5820 

GACGAAGTCT TGGTTCCTCT ATCTGAAATG AGTAAGAAAC TGCCATATTA CAGTAAGATT 5880 

AAGCCACAAT TGTCAGAACC GAGTGTGGTC GACATGGAAG CTGGAATTTA CAGTTACACT 5940 

GTGGCGGATA AATTAATTAT GGAGGCTGAG GAAAAAGCCA AACAAGAGGC CAAGGAAGCT 6000

GAGAAAAAAC AAGAAGAAGA ACAGAAAAAA CAAGAGGAAG AGAGCAATCG AAATCAAACA 6060 

AATCAGCGTT CATCGCGTCG CTAGGTTTAC CTTTTCTCTT ATAGTTCTTT AGTGACCATG 6120 

TTTTTACGTT TAATATTTGA CATTTGTTTT TCTTTATGTT ACATCTGCAA TGTAATCGAT 6180 

TACAAAATAA TTTTTGATGA AGAAGGTAAC ACATATG 6217 
(2) INFORMATION FOR SEQ ID NO: 53:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1491 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:
CTTGACACTT GATTGCGACT GTTGAATCTT ATCTCTCCAA GAAAAACACG TGAAGATGTT 60 

GAGTCTGCTG TCAGCAAGCT TGAAAGTAGC ACATCTGAGA AACATTGGAT CCATCTGCAG 120 
TTTCTCGTGG GTCTAGCTTG GATCGTGATG ACAATGGTCT TTTGACTCTT GCTGGCGGTA 180 
AAATCACAGA CTACCGTAAG ATGGGTGACG AGGCGCTATG GAGCGCGTGG TTGACATCCT 240
CAAAGCAGAA TTTGACCGTA GCTTTAAATT GATCAATTCT AAAACTTACC CTGTTTCAGG 300
TGGAGAATTG AACCCAGCAA ATGTGGATTC AGAAATCGAA GCCTTTGCGC AACTTGGAGT 360 
TTCACGTGGT TTGGATAGCA AGGAAGCTCA TTACCTAGCA AATCTTTACG GTTCAAATGC 420 
ACCGAAGGTC TTTGCACTTG CTCACAGCTT GGAACAAGCG CCAGGACTCA GCTTGGCAGA 480 
TACTTTGTCC CTTCACTATG CAATGCGCAA CGAGTTGGCT CTTAGCCCAG TTGACTTCCT 540 
TCTTCGTCGT ACCAACCATA TGCTCTTTAT GCGTGATAGC TTGGATAGCA TCGTTGAGCC 600 
AGTTTTGGAT GAAATGGGAC GATTCTATGA CTGGACAGAA GAAGAAAAAG CAACTTACCG 660 
TGCTGATGTC GAAGCAGCTC TCGCTAACAA CGATTTAGCA GAATTAAAAA ATTAAGAAAA 720
AATAAAAGAG GTGGAGGGCA GCATTCCTTG TCGCCCGTCC CTTCTTTTTA ATGGAGACAG 780 
AAAGATGATG AATGAATTAT TTGGAGAATT TCTAGGGACT TTAATCCTGA TTCTTCTAGG 840 

AAATGGTGTT GTTGCAGGTG TGGTTCTTCC TAAAACCAAG AGCAATAGCT CAGGTTGGAT 900 
TGTGATTACT ATGGGTTGGG GGATTGCAGT TGCGGTTGCA GTCTTTGTAT CTGGCAAGCT 960 

CAGTCCAGCT CATTTAAACC CAGCTGTGAC CATCGGTGTG GCCTTAAAAG GTGGTTTGCC 1020

TTGGGCTTCC GTTTTGCCTT ATATCTTAGC CCAGTTCGCA GGGGCCATGC TGGGTCAGAT 1080 

TTTGGTTTGG TTGCAATTCA AACCTCACTA TGAGGCAGAA GAAAATGCAG GCAATATCCT 1140 

GGCAACCTTC AGTACTGGAC CAGCCATCAA GGATACTGTA TCAAACTTGA TTAGCGAAAT 1200 

CCTTGGAACC TTTGTTTTGG TGTTGACAAT CTTTGCTTTG GGTCTTTACG ATTTTCAGGC 1260 

AGGTATCGGA ACCTTTGCAG TGGGAACTTT GATTGTCGGT ATCGGTCTAT CACTAGGTGG 1320

GACAACAGGT TATGCCTTGA ACCCAGCTCG TGACCTTGGA CCTCGTATCA TGCACAGCAT 1380 

CTTGCCAATT CCAAACAAGG GAGACGGAGA CTGGTCTTAC GCTTGGATTC CTGTTGTAGG 1440 

CCCTGTTATC GGAGCAGCCT TGGCCGTGCT TGTATTGTCA CTTTTCTAAT C 1491 
(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1229 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:

ACAACGGATA ATGTCATCGA TCTCTTTGAA CACATCTTTA AGGAATGTTC AACGAAAACA 60

TTGTGATGGC GGGCAAGGTC AATCTCTTGA ATTTTGCCAA TCTAGCAGCC TATCAGTTCT 120 

TTGACCAACC GCAAAAGGTG GCCTTGGAGA TTCGTGAGGG GTTGCGTGAG GATCAGATGC 180 

AAAATGTTCG TGTTGCAGAC GGTCAAGAGT CCTGTTTAGC TGACCTAGCG GTGATTAGTA 240 

GTAAGTTCCT CATTCCTTAT CGGGGAGTTG GAATTCTAGC CATTATCGGT CCAGTTAATC 300 

TGGATTACCA ACAGCTAATC AATCAAATCA ATGTGGTCAA CCGTGTTTTG ACCATGAAGT 360 

TGACAGATTT TTACCGCTAC CTCAGCAGTA ATCATTACGA AGTACATTAA GATTGAAATC 420 

ATTAAAGGAG GCGAACATGG CCCAAGATAT AAAAAATGAA GAAGTAGAAG AAGTTCAAGA 480

AGAGGAAGTT GTGGAAACAG CTGAAGAAAC AACTCCTGAA AAGTCTGAGT TGGACTTGGC 540 

AAATGAACGT GCAGATGAGT TCGAAAACAA ATATCTTCGC GCTCATGCAG AAATGCAAAA 600 

TATCCAACGC CGTGCCAATG AAGAACGTCA AAACTTGCAA CGTTATCGTA GCCAGGACTT 660 

GGCAAAAGCA ATCTTACCAT CTCTTGACAA CCTTGAGCGT GCACTTGCAG TTGAAGGTTT 720 

GACAGATGAT GTGAAGAAGG GCTTGGCGAT GGTGCAAGAA AGCTTGATTC ACGCTTTGAA 780

AGAAGAAGGA ATTGAAGAAA TCGCAGCAGA TGGCGAATTT GACCATAACT ACCATATGGC 840 

CATCCAAACT CTCCCAGGAG ACGATGAACA CCCAGTAGAT ACCATCGCCC AAGTCTTTCA 900 

AAAAGGCTAC AAACTCCATG ACCGCATCCT ACGCCCAGCA ATGGTAGTGG TGTATAACTA 960 

AGATACAAAC GCTCGTAAAA AGCTCGCAGT AAAAATAGGA GATTGACGAG TGTTCGATGA 1020 

ACACAAGAAA ATCTATCTTT TTTACTCAGA GCTTAGGGCG TGTTCGATTC GGCAATTCTG 1080

ACGGTAGCTA AAGCAACTCG TCAGAAAACG GCAATCGCTA TGACGTTTGC CTAGCTTCCT 1140 

TACTAACTCG TCGTCGAAAT AAAATCGATT TCGACTCCTC GTGTCGCAAT TTACATAATA 1200 

GAAAACTTGT CCGAACGACA TAAACTATG 1229 
(2) INFORMATION FOR SEQ ID NO:55: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5816 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:



AAGAAGAAGA CTGTATGGAT AATCGACCAA TTGGTTTTTT GGATTCGGGT GTCGGGGGCT 60

TGACCGTTGT GCGCGAGCTC ATGCGCCAGC TTCCCCATGA AGAAATCGTC TATATTGGAG 120 

ATTCGGCGCG GGCGCCCTAT GGCCCCCGTC CTGCTGAGCA AATTCGTGAA TATACTTGGC 180 

AGCTGGTCAA CTTTCTCTTG ACCAAGGATG TCAAAATGAT TGTCATTGCT TGTAACACTG 240 

CGACTGCGGT TGTCTGGGAA GAAATCAAGG CTCAACTAGA TATTCCTGTC TTGGGTGTAA 300 

TTTTGCCAGG AGCTTCGGCA GCCATCAAGT CCAGTCAAGG TGGGAAAATC GGAGTGATTG 360 

GAACGCCCAT GACGGTACAA TCAGACATAT ACCGTCAGAA AATCCATGAT CTGGATCCCG 420 

ACTTACAGGT GGAGAGCTTG GCCTGTCCCA AGTTTGCTCC CTTGGTTGAG TCAGGTGCCC 480 

TGTCAACCAG TGTTACCAAG AAGGTGGTCT ATGAAACCCT GCGTCCCTTG GTTGGAAAGG 540 

TGGATAGCCT GATTTTGGGC TGTACTCATT ATCCACTCCT TCGCCCTATT ATCCAAAATG 600 

TGATGGGGCC AAAGGTTCAG CTCATCGATA GTGGGGCAGA GTGCGTACGG GATATTTCAG 660 

TCTTACTCAA TTATTTTGAA ATCAATCGTG GTCGCGATGC TGGACCACTC CATCACCGTT 720 

TTTACACAAC AGCCAGTAGC CAAAGTTTTG CACAAATTGG TGAAGAATGG CTGGAAAAAG 780 

AGATTCATGT GGAGCATGTA GAATTATGAC AAATAAAATT TATGAATATA AGGATGACCA 840 

GAACTGGTAT GTTGGGTCTT ATAGTATTTT TGGTGGCGTT AACAGTTTGA GCGACTATAA 900 

GGCAGATTTT CCTCTGTTTG AATTCTCCAA AATATTTGGA GATGAAGAGT ATGGTTTCCC 960 

GCTTTCAGTT ACTGTTTTAC GCTATGGTTC TACCTACCGT TTGTTCTCCT TTGTGGTAGA 1020 

CATGCTTAAT CAAGAAATGG GACGAAACTT GGAAGTTATT CAACGTCATG GGGCCCTGCT 1080 

CTTGGTTGAA AATGGGCAAC TCTTGTATGT AGAATTGCCT AAAGAAGGGG TCAATGTTCA 1140 

TGATTTCTTT GAGACAAGCA AGGTCAGAGA AACCTTGTTG ATTGCGACTC GTAACGAAGG 1200 

TAAAACCAAG GAATTCCGAG CTATCTTTGA TAAGTTAGGC TACGATGTGG AAAATCTTAA 1260 

TGACTACCCT GACCTGCCTG AAGTAGCAGA AACAGGTATG ACCTTTGAAG AAAATGCCCG 1320 

CCTTAAGGCA GAAACCATTT CTCAATTAAC GGGCAAGATG GTTTTGGCAG ATGATTCTGG 1380 

TCTCAAAGTC GATGTCCTTG GTGGCTTACC AGGCGTCTGG TCAGCTCGTT TCGCAGGTGT 1440 

GGGAGCAACT GACCGTGAAA ATAATGCCAA ACTCTTGCAC GAATTGGCCA TGGTCTTTGA 1500 

ACTCAAGGAC CGCTCGGCTC AGTTCCACAC AACCCTAGTC GTAGCCAGCC CAAATAAGGA 1560 

AAGTTTAGTT GTTGAAGCAG ACTGGTCAGG TTATATTAAC TTTGAACCTA AGGGTGAAAA 1620 

TGGCTTTGGC TATGATCCCC TCTTCCTTGT AGGAGAGACA GGTGAGTCAT CAGCTGAATT 1680 

AACCCTGGAA GAAAAAAATA GTCAATCTCA CCGTGCCTTA GCCGTTAAGA AACTTTTGGA 1740 

GGTATTTCCA TCATGGCAAA GCAAACCATC ATTGTAATGA GCGATTCCCA TGGCGATAGC 1800 

TTGATTGTGG AAGAAGTCCG TGATCGCTAT GTGGGCAAAG TCGATGCCGT TTTTCATAAC 1860

GGCGATTCTG AACTACGTCC GGATTCTCCA CTTTGGGAGG GCATCCGCGT TGTTAAAGGG 1920

AACATGGACT TCTACGCCGG CTACCCAGAA CGTCTGGTGA CTGAGCTTGG TTCGACCAAG 1980

ATTATCCAAA CTCATGGTCA CTTGTTTGAC ATCAATTTCA ACTTTCAAAA GTTGGACTAC 2040

TGGGCTCAGG AGGAAGAGGC CGCTATCTGC CTCTATGGTC ACTTGCATGT GCCAAGTGCT 2100

TGGATGGAAG GCAAGATCCT CTTTCTAAAT CCAGGTTCTA TCAGTCAACC ACGAGGTACC 2160

ATCAGAGAAT GTCTCTATGC TCGTGTGGAG ATTGATGATA GTTACTTCAA AGTGGACTTT 2220

TTGACACGAG ATCACGAGGT GTATCCAGGT TTGTCCAAGG AGTTTAGCCG ATGATTGCCA 2280

AGGAGTTTGA GACTTTCTTG TTGGGGCAGG AGGAAACTTT TTTGACCCCT GCTAAAAATC 2340

TAGCTGTGTT GATTGATACC CACAATGCGG ATCATGCGAC CCTCTTGCTC AGTCAGATGA 2400

CCTATACCCG TGTTCCCGTT GTGACAGATG AAAAACAGTT TGTTGGGACG ATTGGACTCA 2460

GAGATATTAT GGCTTATCAG ATGGAGCATG ACTTGAGCCA AGAAATCATG GCGGATACGG 2520

ATATCGTTCA TATGACAAAA ACGGACGTAG CGGTTGTTTC GCCTGATTTC ACCATTACGG 2580

AGGTCTTGCA CAAGCTAGTA GATGAGTCCT TCTTACCGGT CGTGGATGCA GAGGGTATTT 2640

TCCAAGGGAT TATTACGCGC AAGTCCATCC TCAAGGCCGT TAATGCCCTC TTGCATGACT 2700

TTAGTAAGGA ATATGAGATT CGATGCCAAT GAGAGACAGG ATTTCAGCCT TTTTAGAGGA 2760

AAAGCAGGGC TTGTCTGTCA ATTCCAAGCA GTCCTATAAG TATGATTTGG AGCAATTTTT 2820

AGACATGGTA GGTGAGCGGA TTTCTGAGAC CAGTCTCAAG ATTTACCAAG CCCAGCTAGC 2880

CAATCTAAAA ATCAGCGCCC AGAAGCGAAA GATTTCGGCC TGTAACCAAT TTCTATACTT 2940

TCTCTATCAA AAAGGAGAGG TGGACAGCTT TTACCGCTTG GAATTAGCTA AACAAGCTGA 3000

AAAGAAGACG GAAAAGCCAG AGATTCTATA CCTAGACTCT TTTTGGCAGG AAAGCGACCA 3060

TCCAGAGGGC CGCTTGCTAG CGCTCTTAAT CCTAGAAATG GGGCTCTTGC CCAGTGAGAT 3120

TTTAGCCATC AAGGTTGCGG ACATCAATCT GGATTTTCAG GTGCTGCGAA TCAGCAAGGC 3180

TTCCCAACAG AGGATTGTCA CCATTCCCAC GGCCTTGCTT TCAGAATTGG AACCCTTGAT 3240

GGGGCAGACC TATCTTTTTG AAAGAGGAGA GAAACCCTAT TCTCGTCAGT GGGCCTTTCG 3300

TCAGTTAGAA TCTTTTGTCA GGAGAAGGTT TCCATCCTTA TCAGCTCAAG TCTTACGTGA 3360

CAGTTTATTC TAAGCAAATA GAAACAGGTC GATTTGTACG AATTGCAAAA AATTAGGATT 3420

AAAAACAGTC CTGACCTTAG AAAATATAGA TAATGGATAT TAAATTAAAA AGATTTTTGA 3480

AGGACCCTGG ACTTGCTCTT TGCATCTGGT TTCTAAGTAC CAAGATGGAT ATCTACGATG 3540

TGCCCATTAC GGAAGTCATC GAACAGTATC TAGCCTATGT TTCAACCCTG CAGGCCATGC 3600

GTCTGGAAGT GACGGGTGAG TACATGGTCA TGGCTAGTCA GCTCATGCTG ATTAAGAGTC 3660

GTAAACTCCT TCCGAAGGTA GCAGAAGTGA CAGACTTGGG GGATGACCTG GAGCAGGACC 3720

TCCTCTCTCA AATCGAAGAA TATCGCAAGT TCAAGCTCTT GGGTGAGCAC TTGGAAGCCA 3780 

AGCACCAAGA ACGGGCCCAG TATTATTCCA AAGCGCCGAC AGAGTTGATT TACGAAGATG 3840 

CGGAGCTTGT GCATGACAAG ACGACCATTG ACCTCTTTTT GGCTTTTTCA AATATCCTAG 3900 

CCAAGAAAAA AGAGGAGTTT GCACAAAATC ACACGACGAT CTTGCGGGAT GAGTATAAGA 3960 

TTGAGGACAT GATGATTATT GTGAAAGAAT CCTTGATTGG ACGAGATCAA TTGCGCTTGC 4020 

AGGATTTGTT CAAGGAAGCC CAGAATGTCC AAGAGGTCAT CACCCTCTTT TTGGCAACCC 4080 

TAGAGTTAAT CAAAACCCAG GAGCTGATCC TCGTGCAAGA GGAGAGTTTC GGAGATATCT 4140

ATCTCATGGA AAAGAAGGAA GAAAGTCAAG TGCCTCAAAG CTAGACTTGA TAGAGAGGAA 4200 

AGATGAGTAC TTTAGCAAAA ATAGAAGCGC TCTTGTTTGT AGCGGGTGAA GATGGGATTC 4260 

GGGTCCGCCA GTTAGCTGAA CTCCTCTCTC TGCCACCGAC AGGCATCCAG CAGAGTTTAG 4320 

GAAAATTAGC CCAGAAGTAT GAAAAGGACC CAGATTCCAG TTTGGCTTTG ATTGAGACAA 4380 

GTGGTGCTTA TAGATTGGTG ACCAAGCCTC AATTTGCAGA GATTTTGAAG GAATACTCTA 4440

AGGCGCCTAT CAACCAGAGC TTGTCTCGGG CTGCCCTTGA GACCTTGTCC ATTATTGCCT 4500 

ACAAACAGCC GATTACGCGG ATAGAAATTG ATGCCATCCG TGGGGTTAAC TCGAGTGGAG 4560 

CCTTGGCAAA GTTGCAAGCT TTTGACCTGA TAAAGGAAGA CGGGAAAAAG GAAGTATTGG 4620 

GGCGCCCCAA CCTCTATGTG ACTACGGATT ATTTCCTAGA TTACATGGGG ATAAACCATT 4680 

TAGAAGAATT ACCAGTGATT GATGAGCTTG AGATTCAAGC CCAAGAAAGC CAATTATTTG 4740

GTGAAAGGAT AGAAGAAGAT GAGAATCAAT AAGTATATTG CCCACGCAGG TGTGGCCAGT 4800 

AGGAGAAAAG CAGAAGAGCT GATCAAGCAA GGTTTGGTGA CAGTTAACGG ACAAGTGGTG 4860 

CGTGAACTAG CAACCACTAT CAAGTCAGGC GACAAGGTCG AAGTTGAAGG TCAACCTATC 4920 

TACAACGAAG AAAAGGTCTA TTATCTGCTT AACAAACCAC GCGGTGTCAT TTCCAGTGTA 4980 

ACAGATGACA AGGGTCGCAA GACGGTTGTC GACCTCTTGC CCAATGTCAA AGAGCGCATT 5040

TACCCTGTGG GTCGTTTGGA CTGGGATACA TCAGGAGTCT TGATTTTGAC CAATGATGGG 5100 

GACTTTACAG ACGAGATGAT TCACCCTCGT AATGAGATTG ACAAGGTTTA TGTCGCGCGT 5160 

GTTAAAGGTG TGGCCAATAA GGACAATCTT CGCCCCTTGA CCCGTGGTCT TGAGATTGAT 5220 

GGTAAGAAAA CCAAGCCAGC TGTTTATGAA ATTCTCAAAG TGGACCCAGT CAAAAATCGC 5280 

TCTGTGGTGC AGTTGACCAT CCATGAAGGG CGTAACCATC AGGTTAAAAA GATGTTTGAA 5340

GCTGTTGGTC TCCAAGTAGA TAAGTTGTCT CGGACTCGTT TCGGACACCT AGACTTGACA 5400 

GACTCCGTCC AGGAGAATCC CGTCGTCTTA ATAAAAAAGA AATCAGCCAA CTACACACCA 5460 

TGGCTGTAAC TAAGAAATAA TGAAACGAAT TTTAATAGCG CTTGTGCGCT TTTACCAACG 5520

TTTTATCTCA CCAGTCTTTC CACCCTCTTG TCGCTTTGAG CTGACTTGTT CCAACTACAT 5580

GATTCAGGCT ATTGAAAAAC ATGGTTTTAA GGGGGTATTG ATGGGCTTGG CTCGGATTTT 5640 

ACGTTGTCAT CCCTGGTCGA AAACAGGTAA GGACCCCGTT CCAGACCACT TTTCCCTTAA 5700 

ACGAAATCAA GAAGGGGAAT GAGGTGGGGT AAATAGATTT CAAAATGATA AAAACGCATC 5760 

CTATCAGGTT TGAGTGAACT TGATAGGATG CGTTTTAGAA TGTCAAAATT TTATAC 5816

(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 725 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 

TTGAAAATAA TTATGAACCG CAATATATTA ATATCCGAGG AAAAGGCCCT CTTATCAATG 60

ACTTGAAAAA AGAAGCTAAA AAAGCTAATA AAGTTTTTCT CGCGAGTGAC CCGGACCGTG 120 

AAGGAGAAGC GATTTCTTGG CATTTGGCCC ATATTCTCAA CTTGGATGAA AATGATGCCA 180

ACCGTGTGGT CTTCAATGAA ATCACCAAGG ATGCAGTCAA AAATGCTTTT AAAGAACCTC 240 

GTAAGATCGA TATGGACTTG GTCGATGCCC AACAAGCTCG TCGGATCTTG GATCGCTTGG 300 

TAGGGTATTC GATTTCGCCT ATTTTGTGGA AGAAGGTCAA GAAGGGCTTG TCAGCAGGTC 360 

GCGTTCAGTC CATTGCCCTT AAACTCATCA TTGACCGTGA AAATGAAATC AATGCCTTCC 420 

AGCCAGAAGA ATACTGGACA GTTGATGCTG TCTTTAAAAA GGGAACCAAA CAATTTCATG 480

CTTCCTTCTA TGGAGTAGAT GGTAAAAAGA TGAAACTGAC CAGCAATAAC GAAGTCAAGG 540 

AAGTCTTGTC TCGTCTGACG AGTAAAGACT TTTCAGTAGA TCAGGTGGAT AAGAAAGAGC 600 

GTAAGGCAAA TGCTCCTTTA CCCTATACCA CTTCATCTAT GCAGATGGGA TGCTGCCAAT 660 

AAAATCAATT TCCGTACTCG AAAAACCATG ATGGTTGCCC AACAAGCTCT ATGAAGGAAT 720 

TATAT 725

(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1935 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57: 

AACCCATCTG AAGAACTTTT CCGTGCTGCT AGCTCAGCTA TCGATAAAGC AGAAACTAAA 60

GGTTTGATTC ACAAAAACAA AGCAAGCCGC GATAAAGCTC GTCTTTCAGC TAAACTTGCT 120 

AAATAAGAAA CAGTCCATAG AGGCTGTTTT TTTGTCTCCA AATAGGAAAA GGTAGAAAAT 180 

GAAAATCACA ATTATCGGAT ATTCTGGTTC TGGTAAGTCA ACTCTAGCAG AAAAGTTATC 240 

TAACTACTAC TCCATTCCAA AACTGCACAT GGACACACTC CAATTTCAAC CTGGTTGGCA 300 

AGACAGTGAC TGCGAATGGA TGTTAACCGA GATAAAAAAC TTTCTCACCA AGCATAAAGC 360

TTGGGTCATC GATGGTAATT ATTCTTGGTG CTACTACCAA GAACGAATGC AAGAAGCTGA 420 

CCAAATCATC TTTCTCAATT TTTTGCCATT GACCTGTCTC TTTAGAGCCT TTAAGCGTTA 480 

TCTTAAATAC CGTGGAAAAG TCAGAGAAAG TATGGCGGCA GATTGCCCTG AACGCTTTGA 540 

GTGGGAGTTT ATCAGATGGA TTCTTTGGGA TGGGCGTAGC AAAACTCAAA AAGAAAATTA 600 

CCAAAAACTT TGCCAAGAAT ATTCACATAA AGTCACTATC CTTCGAAATC AGAGAGAGCT 660

AGATCAATTT CTGGATAAGA AAAGGAAGTC CTACAATTCA TAAAGGGCTT CCTTTTTGGC 720 

TATAATTATT CTGCAATCAA GGTTTCCAAA CCAACCTTCA TCATATCAGT GAAGGTATTT 780 

TGACGTTCTT CTGCAGTTGT GTCTTCGTCT GGATTGACCA AGCTATCAGA GATGGTCATG 840 

ATAGCTAGCG CATCAACATG GTATTGGGCA GCAAGATAGT AAAGAGCTGC TGCTTCCATT 900 

TCCACAGCCT TGACTCCCCA TTTACCAAGC TCGATATTCT TTTCAAAGTA ATTTGAGTAA 960

AAGACATCAG ATGACAAAAC GTTCCCAACG TGAGTAGTCA TACCAAGTTC TTTGGCGATA 1020 

TGGTAGGCTT TATCAAGCAA ATCAAAGCTA GCAATTTGTG GAAAATCGTA CTGTGGCCAG 1080 

TCATTACGAA CGATGTTTGA GTTGGTTGCA GCCGCCTGCG CCAAAACTAA TTCACGAACA 1140 

TGAACCTCTT CATTCAAAGA ACCTGCAGTT CCCACACGAA TCAATTTCTT CACACCGTAG 1200 

TCTACGATTA ACTCACGCGC ATAAATCGAA ATAGATGGCA TTCCCATCCC AGTTCCCATG 1260

ACAGATACAC GGTGACCCTT GTAAGTACCA GTGTAACCAA ACATGTTACG CACTTCGTTA 1320 

AAACAAACAG CATCACCAAG GAAATTCTCC GCAATAAACT TAGCACGAAG AGGATCCCCA 1380 

GGAAGAAGAA TTTTATCAGC AATTTCACCC TGCTGAGCAG CAATATGGAT AGACATAATT 1440

TATGATACAA AGAGCGAGAA GAAAACGACT GAAAATTAGG AACCTGACGA GAAATCCTGA 1500

TTTTTCAGTC AGATTATCTA TTTTCCGAGT TTTCCGCTCG TGTTCAAATC AAAACACACG 1560 

CTCTACCTTT CTTTATTTTA TATTTTATAT TGAGAAAGAT ACCAAACCCA TCAAAAAGCG 1620 

AAGGGAAAAT AGGAGTTGGG CGCAGTGAGC GATGCTCGCT AGACCAACTA TCTTTTTCCC 1680 

ACTGCTTTTA GGGTGGGGTC AATTCCTTTC TTTCTTAATT TTGATTTAGA GGAGAGTCGC 1740

CCGTATTCAG TTCAGCGAAT ACAGTTTACC CATCCTTTCG TTTTTATTTT TAGAAAAGTT 1800 

TTCTACTCGT GTTCAAATTA GAACACGCGC TCTACCTTTC TGTTTATACT CTTCGAAAAT 1860 

CTCTTCAAAC CACGTCAACG TCGACTTGGA TTATATATGT GACTGACTTC GTCATCTTTA 1920 

TCTACAACCT CAAAG 1935

(2) INFORMATION FOR SEQ ID NO: 58:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2221 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: 

TATTATTTTT CCCATCCTAA CTGGAACCTA TGTCGCGCGT GTCTTGGACC GAACTGACTA 60 

TGGTTACTTC AACTCAGTCG ACACTATTTT GTCATTTTTC TTGCCCTTTG CAACTTATGG 120 

TGTCTATAAC TACGGTTTAA GGGCTATCAG TAATGTCAAG GATAACAAAA AAGATCTTAA 180 

CAGAACCTTT TCTAGTCTTT TTTATTTGTG CATCGCTTGT ACGATTTTGA CCACTGCTGT 240

CTATATCCTA GCCTATCCTC TCTTCTTTAC TGATAATCCA ATCGTCAAAA AGGTCTACCT 300 

TGTTATGGGG ATTCAACTCA TTGCCCAGAT TTTTTCAATC GAATGGGTCA ATGAAGCTCT 360

GGAAAATTAC AGTTTTCTCT TTTACAAAAC TGCCTTCATC CGTATCCTGA TGCTGGTCTC 420

TATTTTCTTA TTTGTTAAAA ATGAACACGA TATTGTTGTC TATACACTTG TGATGAGTTT 480 

ATCGACGCTG ATTAACTACC TGATTAGTTA TTTTTGGATT AAAAGAGACA TCAAACTTGT 540

TAAAATTCAC CTAAGTGATT TTAAACCACT CTTTCTCCCT CTGACAGCCA TGTTAGTCTT 600 

TGCCAATGCC AATATGCTCT TCACTTTTTT AGATCGCCTC TTCCTCGTTA AAACAGGGAT 660 

TGATGTCAAC GTTAGTTACT ATACCATAGC TCAGCGAATT GTGACCGTTA TAGCTGGGGT 720

TGTAACAGGT GCAATTGGAG TGAGTGTGCC TCGTCTCAGT TACTATCTGG GGAAAGGAGA 780

CAAAGAAGCC TATGTTTCTC TGGTTAATAG AGGTAGTCGA ATCTTTAACT TCTTTATCAT 840

TCCACTGAGT TTTGGACTCA TGGTTTTAGG ACCAAATGCC ATCCTACTTT ACGGTAGTGA 900

AAAATATATC GGAGGCGGCA TCTTGACCTC TCTCTTCGCT TTTCGTACGA TTATCCTGGC 960

CTTAGATACC ATTCTTGGTT CCCAAATTCT CTTTACCAAT GGCTATGAAA AACGTATCAC 1020 

AGTCTATACA GTCTTTGCTG GGCTACTCAA TTTGGGCTTG AATAGTCTCC TTTTTTTCAA 1080 

CCATATCGTG GCTCCTGAAT ACTACTTACT GACAACTATG CTATCAGAGA CTTCTCTACT 1140 

TGTTTTCTAT ATCATTTTCA TCCATAGAAA ACAACTCATC CACTTGGGAC ATATCTTTAG 1200 

CTATACTGTT CGATACTCTC TCTTTTCACT TTCCTTTGTA GCAATTTATT TCCTGATTAA 1260 

TTTCGTGTAT CCTGTAGATA TGGTCATTAA TTTGCCATTT TTGATTAATA CTGGTTTGAT 1320 

TGTCTTGCTA TCAGCTATCT CTTATATTAG TCTACTTGTC TTCACAAAAG ATAGCATTTT 1380 

CTATGAATTT TTAAACCATG TCCTAGCCTT AAAAAATAAA TTTAAAAAAT CATAGGAGTT 1440 

TAAAATGAAA CAACTAACCG TTGAAGATGC CAAACAAATT GAATTAGAAA TTTTGGATTA 1500 

TATTGATACT CTCTGTAAAA AGCACAATAT CAACTATATT ATTAACTACG GTACTCTGAT 1560 

TGGGGCGGTT CGACATGAGG GCTTTATCCC TTGGGACGAC GATATTGATC TGTCCATGCC 1620 

TAGAGAAGAC TACCAACGAT TTATTAACAT TTTTCAAAAG GAAAAAAGCA AGTATAAGCT 1680 

CCTATCCTTA GAAACTGATA AGAACTACTT TAACAACTTT ATCAAGATAA CCGACAGTAC 1740 

GACTAAAATT ATTGATACTC GAAATACAAA AACCTATGAG TCTGGTATCT TTATCGACCT 1800 

GCAGGCATGC AAGCTTGGCG TAATCATGGT CATAGCTGTT TCCTGTGTGA AATTGTTATC 1860 

CGCTCACAAT TCCACACAAC ATACGAGCCG GAAGCATAAA GTGTAAAGCC TGGGGTGCCT 1920 

AATGAGTGAG CTAACTCACA TTAATTGCGT TGCGCTCACT GCCGGCTTTC CAGTCGGGAA 1980 

ACCTGTCGTG CCAGCTGCAT TAATGAATCG GCCAACGCGC GGGGAGAGGC GGTTTGCGTA 2040

TTGGGCGCCA GGGTGGTTTT TCTTTTTCAC CAGTGAGACG GGCAACAGCT GATTGCCCTT 2100 

CACCGCCTGG CCCTGAGAGA GTTGCAGCAA GCGGTCCACG CTGGTTTGCC CCAGCAGGCG 2160 

AAAATCCTGT TTGATGGTGG TTCCGCAAAT CGGCAAAATT CCTTATAAAA TCAAAAGGAA 2220 

T 2221 
(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1509 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59: 

TGAATTTTGA ACAGTACACA GAATACTAAA ATATTTCTAG AAATTAATTT GAATTTTCTA 60 

ATTGGATTTG TCGCATCTTA TTTCAATCTA CTATAGAAAA AAGTCTTTAA AATTATAAAA 120 

CGCATCATAT CAAGGTTTTT CAAAAACCTT GATATGATGC CTTTTATTGT GGGAATATGT 180 

ATTTCATTTT CTACTAAAAT TATGTTTTTG AATAACCTCT ATCTTAGTAG TTTGTATAAT 240 

CCCCCTCAAT CAGTTTTTGC GATAAGCTTT AATGCTATGA CTATACCACT CTTGCATTTC 300 

TTTTGATGGT GGTGTCATAT AATCGCCATA CATCTGGGTT AAAAATTGGT CATATTTTTT 360 

GGGAACAGGC AACATACGGC CCTCAAACTC AGTTAAAATC AGTTCTTTAA AGGTATCAAC 420 

TGGGAAGATT TCTTTCATCC CTTCCTTACC GATCCCAACT CCTCCTTCAT ATTGAGGAGT 480 

GTTGGTTACA GCATTTTTGA CTAGTTGATC AATTTTCTTG TAAAAGTAGC GAGGATTGAC 540 

AAATCGGAGA GCGTACCAGC TACATAATCT AAGAAAATCT TTTAGTTTGC TATCACCGTG 600 

AACTGCTCGT GATTTTTTGA TATAAGCTAG TTGACGAAGA GCCACATACT TATAGCTCTT 660 

GTCGACAATG CTCAAGTCTG TAAATCGATC AATTGGGAAG ACATCGATGA AAAGGCTGGT 720 

ATCATGACGC TTGTACTTAA CATGGTCTTC TATAACAGTA GAAGTGTCCA AAATCGATGC 780 

GAAATTATGG AAGTACCAAG AAGATGTATC GTAGGAAAGA ACCTTGTAGC GAGGGTGATT 840 

TTCTTCTTCA ATAATCTTCA GTAAACGCTC ATAATCCTCA CGATAAAGGG AAATATCAAT 900 

ATCATCATCC CAAGGAATCA TACCTTTGTG GCGGATGGCT CCAAGCATGG TTCCATAACT 960 

GAGAAAATAA GGAATATCAT GTTTCTTACA AGTCTCATCA ATATAGTCCA GCAGGGCTAG 1020 

TTGAATTTCT TTAATTTCTT TTTTTTCTAA ATATTGCATC CTAATCCTCC AATTTATAAG 1080 

CGTGAAATTC ATGACTGTAG AAGCGTTTTT CTTCTGGTGG TAGGGTCATA TAATCTCCAT 1140 

AAAATTGTGT CAAAATAGTA TCAAATTTTT CAGGTGCAGG AAGGCTTAAA TTCTCAAAGG 1200 

GTAAATCGAT TGTTTTATCA AAGGTACCAC TTGGGAAGAC TTCCTTTTCC TTAAATTTTG 1260 

AAGGGATAAA AGCCATATAT TGCCCATTTT CACGACTATA TTTTTGAATT TCTTTCTCGA 1320 

TTTTATTTGC AAAATAACGA GGAGAAACCG GTCGAAGGAG TAACCAGAAG GCTGTTCGTA 1380 

TCCAATCTTT TAAAAGGCTA TCCTTATAGA CAATATTTTT ATGTTTACTG AAAGACAGCA 1440 

GTTTGAAGCT TTCCAGTTTA TAACAAGTAT CAATGACCTT AGGATCATCA AAGCGATCTA 1500

TAGGGAAAA 1509
(2) INFORMATION FOR SEQ ID NO:60: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 671 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 



ACAAGGGATT TATTCCTTGG GACGACGACC TAGACTTTTT TATGCCTCGT AAAGATTATG 60

AGAAATTAGC AGAATTATGG CCTCGTTATG CAGATGAACG TTATTTCTTG TCAAAGAGTC 120

ACAAGGATTT TGTTGATCGT AATCTTTTTA TTACCATTCG TGACAAGAAA ACCACCTGTA 180

TCAAGCCTTA TCAGCAGGAT TTGGATTTGC CACATGGTCT GGCCTTGGAT GTTTTGCCTT 240

TGGATTATTA TCCGAAAAAT CCAGCTGAGC GGAAAAAACA GGTTCGTTGG GCCTTGATTT 300

ATTCACTCTT TTGTGCGCAA ACTATTCCAG AAAAGCATGG TGATCTCATG AAATGGGGAA 360

GTCGCATTTT ACTGGGTTTG ACTCCAAAAT CTCTCCGTTA TCGCATCTGG AAAAAAGCTG 420

AGAAAGAAAT GACTAAGTAT GATTTGGCTG ATTGTGATGG CATTACAGAA TTATGCTCAG 480

GTCCTGGCTA CATGAGAAAC AAGTACCCAA TCACATCTTT TGAAGACAAT CTTTTCTTGC 540

CATTTGAAGG AACAGAGATG CCTATTCCAA TCGGCTATGA TGTCTATCTC AGAACTGCTT 600

TTGGGGATTA TATGACGCCT CCACCAGCAG ACAAGCAGGT ACCGCATCAT GATACTGTCA 660

CTGCTGATAT G 671



(2) INFORMATION FOR SEQ ID NO: 61: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2766 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:

ATCTTATACA AGTCGTAAGC CGCTTCCTTA AAACCAGCTT CTAGTAATTC TTCCAATAAG 60 

ATAGTAACCT TCACACCATT TGGTGTTCCC AGTGAATAAA GCTGAAAAGC TTGTTCTCCT 120 

TTTGGCAAGT TTTGTTCGAA ACGGGCACCT GCTGTTGGTC TGTTTAGCCC CGTAAAAGCT 180 

CCTTGATTAC TAGCTTCATC CTGCCATACG GTCGGTAATT GATATGCTGA CATCCGAGAC 240

CTCCCTTAAA TCGCATTCTT GTCAAAACCG AGTTTGCGTT GAATAAACTT AACGATTTCG 300 

ACGATGATAA TCATTGAGAA GCTTCCAGCC ATAACAATTC CCCATTGTGA CAAGTCTAGT 360 

TTGGTTACGT GGAAGATTCC TTCAAGCGGT TCTACAACGA TTGTTGCCAT GAGAAGGATA 420 

AAGGATACCA AGATGGACCA GTTAAAGGTC TTAGACTTGA ATGGGCCAAC TGTCAAGATG 480

GATTGGTAGA CAGACTTGAC ATTGTAGGCA TGGAAGAGCT GAATCAAACC AAGGGTTGCA 540

AAGGCCATCG TTAGGGCATC TGCATGAATA GCATGATTGT CACCCACATG AACTGGGTAA 600 

GCAATCGCAA GGCCATAAAC ACTCATAACA AGAGCTGCTT GGAGTACACC TTGATAAATG 660 

ATAGAACTCA AAACACCACC TGAGAAGAAG CTTGCCTTGC GTCCACGTGG TTTATGATTC 720 

ATGACACCAG GTTCCGCAGG TTCAACACCA AGAGCGATAG CTGGGAAGGT ATCCGTTACC 780 

AAGTTGATCC ACAAAAGATG AACCGGCTGC AAGACATCCC AACCAAACAA GGTTGATAGG 840

AAGATGGTTA ATACTTCAGC AGTATTAGCA GAAAGTAGGT ACTGAATAGT CTTTTGAATG 900 

TTTGAGAAGA CCTTACGTCC TTCTTCCACT GCGACGATAA TAGTCGCAAA GTTATCATCT 960 

GCAAGAATCA TATCAGAAGC CCCCTTAGAA ACCTCTGTAC CAGTGATTCC CATACCGATA 1020 

CCAATATCGG CTGTTTTCAG AGCTGGCGCG TCATTGACAC CGTCACCTGT CATGGCAACG 1080 

ACTTTACCTT GTTTTTGCCA AGCCTTGACG ATACGAACCT TGTGTTCTGG AGACACACGG 1140

GCATAAACAG AGTATTGACC AACAACTTTT TCAAATTCTT CATCTGACAG TTCATTGAGT 1200 

TCAGCACCAG TTAAAACGTG ACCTTCTGTA TCGTTTGCGT CAATGATTCC CAAACGTTTG 1260 

GCAATGGCTT CCGCTGTGTC TTGGTGGTCA CCTGTAATCA TAATTGGACG GATTCCCGCT 1320 

TCCTTAGCCA CACGAACAGC CTCAGCGGCT TCAGGACGTT CAGGGTCAAT CATCCCAATC 1380 

AAACCAGTAA AAATTAAATC ATTTTCAAGC TCTTCAGAAG TGAGATTTTC TGGAATACTA 1440

TCGATAATCT TATAAGCACC TGCAAGGACA CGCAAGGCTT GATGAGCCAT TTCAGAATTG 1500

TTTGTATGAA TGAGATTTGT AACCTTCTCA TCAATCGGAG CAATATCCCC AGCCTTATCA 1560 

CGAAGAAGAC AACGTTTTAA GAGTTGGTCT GGCGCACCCT TGACTGCTAC AAGGAAACGA 1620 

CTATCTGGCA ATGGGTGAAC TGTTGACATG AGCTTACGGT CAGAGTCAAA TGGCAATTCA 1680 

GCTACACGAG GATATTTCTC TAAGAAACCT TTGACATCAT AGCCCTTGTC CAAGGCATAT 1740

TGGATAAAGG CTGTTTCGGT TGGGTCACCA ATCAAGTTAC CTTCCACATC GATTTTCGTA 1800

TCATTGGCCA AGACAACTGA ACGAAGTAGT GGCATTTCAA GACCTAGTTC AATATCATCA 1860 

GCTGAGTCAT GTAGAACCGC ATCGTAGAAG ACTTTTTCGA CTGTCATCTT GTTCATAGTC 1920 

AGCGTACCAG TCTTATCAGA AGCGATGATT TCAGTTGAAC CAAGTGTTTC AACTGCTGGC 1980 

AACTTACGAA CGATGGAATG TCGTTTGGCC AAAACTTGAG TACCAAGAGA AAGAACGATG 2040 

GTAACGATAG CAGGAAGTCC TTCTGGAATG GCTGCAACAG CAAGGGCAAC AGAAGTCAAC 2100 

AACTCACCAA GTGGATTTTT CCCTTGAATG AAGACACCCA CTACAAAAGT AACAAGGGCA 2160 

ATGACCAAGA TAGCATAGGT CAAGACCTTA GAAAGGTTGT TCAAGTTTTG TTTGAGTGGT 2220

GTATCAGTCT CATCCGCATC TTGAAGCATA CCAGCAATAT GACCAACTTC AGTATACATA 2280 

CCTGTATTGA CAACAACACC CATCCCACGA CCATAGGTTA CGTTTGAGTT TTGGAAGGCC 2340 

ATGTTGACAC GGTCACCAAT GCCAGCATCT GTCGCAAGAT CGACTGACAA GTCTTTTTCG 2400 

ACTGGTACAG ATTCACCTGT CAAGGCTGCT TCTTCAATTT TAAGAGAGTT GGCTTCTATC 2460 

AAACGTAGGT CCGCTGGTAC CACGTCACCT GCTTCAAGGG CAACGATATC GCCTGGTACC 2520

AATTCTTTAG AGTCAATCTC TGCCATGTGT CCATCACGAA GAACGCGGGC AACTGGACTA 2580 

GACATGGATT TGAGGGCTTC AATAGCTTCT TCAGCTTTTC CTTCTTGGTA AACACCAAAG 2640 

GCAGCGTTGA TGATAACCAC AGCTAGGATG ATAATGGCAT CTGCGATATC TTCCCCACCA 2700 

GAAGTCACGA CTGACAAGAT TCTGCCGCAA CTAGGATGAT AATCATCAAA TCCTTAAATT 2760 

GCTCGA 2766

(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1577 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62: 

TGGATTTATC CTCTTTTTCG TTCTTTTGGG AGCAGTTTTT GAGGAAAAAA TGAGAAAAAA 60 

TACGTCCCAA GCTGTGGAGA AATTACTGGA CTTGCAAGCT AAAACCGCAG AAGTCTTGAG 120 

TGATGATAGT TATGTCCAAG TTCCTTTGGA ACAAGTCAAG GTAGGCGACC TGATTCGAGT 180

GCGTCCCGGT GAAAAGATTG CTGTTGATGG TGTCGTAGTA GAAGGTGTCT CTAGTATTGA 240
CGAATCCATG GTGACAGGTG AGAGTCTGCC TGTGGACAAG ACAGTTGGAG ATACTGTCAT 300 
TGGCTCAACC ATCAATCATA GTGGAACGCT TGTCTTTAGA GCAGAAAAAG TTGGCTCAGA 360 
GACTGTTTTG GCTCAGATTG TGGATTTTGT GAAGAAAGCT CAGACAAGTC GTGCGCCGAT 420 
TCAGGACTTG ACGGATAAGA TTTCAGGGAT TTTTGTCCCA GTAGTTGTCA TTTTAGGAAT 480 
CATGACCTTT TGGGTTTGGT TCGTCTTGCT CAGGGATAGT GTGGTCGTGC TTGGAGCTAG 540 
CTTTGTGTCC TCTCTTCTCT ACGGAGTGGC GGTTTGATTA TCGCCTGTCC TTGTGCCTTG 600 
GGACTTGCAA CACCGACAGC CCTTATGGTG GGGACAGGAC GTAGTGCCAA GATGGGGGTT 660
CTCCTCAAAA ATGGAACTGT CTTACAGGAA ATCCAGAAAG TTCAAACTCT TGTCTTTGAT 720 
AAGACCGGGA CTTTGACGGA AGGGAAACCT GTGGTAACAG ATATCATCGG CGACGAAGTA 780 

GAAGTGTTTG GATTGGCAGC CTCCTTGGAA GATGCTTCTC AACACCCACT GGCTGAGGCT 840 
ATCGTTAAGC GAGCGAGTGA AGCTGGACTT GAGTTTCAAA CTGTTGAAAA TTTTCAGGCC 900 
TTGCACGGGA AAGGTGTTTC AGGGCGAATC AATGGAAAAC AAGTTTTACT TGGAAATGCT 960

AAAATGCTGG ATGGCATGGA TATTTCTAAT ACTTATCAAG ATAAACTAGA AGAACTAGAA 1020 

AAAGAAGCTA AGACAGTTGT GTTTTTAGCT GTTGACAATG AAATCAAAGG CTTGCTTGCT 1080 

TTGCAAGATA TTCCTAAGGA AAATGCTAAG CTAGCCATCA GTCAGCTAAA AAAACGTGGT 1140 

CTCCGAACAG TCATGCTGAC AGGAGACAAT GCTGGTGTGG CGCGTGCTAT TGCAGATCAA 1200 

ATCGGAATTG AAGAGGTCAT TGCAGGCGTC TTGCCAGAAG AAAAAGCCCA TGAAATCCAT 1260

AAACTGCAAG CGGCTGGCAA AGTAGCCTTT GTTGGGGACG GTATCAATGA CGCTCCTGCC 1320 

CTTAGTGTAG CAGATGTGGG AATTGCTATG GGTGCTGGAA CAGATATCGC CATCGAGTCA 1380 

GCAGATTTGG TGTTGACAAC CAATAATCTT TTAGGAGTGG TTCGTGCCTT TGATATGAGT 1440 

AAGAAAACCT TTCATCGAAT TCTACTCAAT CTTTTCTGGG CTTTTATCTA CAATGTTGTC 1500 

GGAATTCCGA TTGCAGCAGG AGTCTTTTCA GGTGTTGGCT GGCTCTCAAC CCAGATTGGC 1560

AAGGCTAGCC CAATGGC 1577 
(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1089 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:

AAAATGATAT AATAGAATTT ATGGATAAAA ATAAGATTAT GGGATTAACC CAAAGAGAAG 60

TCAAGGAAAG ACAGGCTGAG GGTTTGGTCA ATGACTTTAC CGCATCAGCC AGTACCAGCA 120 

CTTGGCAAAT CGTTAAACGA AATGTCTTTA CCCTTTTTAA CGCTTTGAAC TTTGCCATTG 180 

CTTTGGCCCT TGCCTTTGTG CAGGCTTGGA GCAATCTGGT CTTCTTTGCT GTTATCTGCT 240 

TTAACGCTTT TTCTGGGATT GTGACCGAGC TACGAGCCAA ACACATGGTG GACAAGCTCA 300

ATCTCATGAC CAAGGAAAAG GTCAAAACCA TCCGTGATGT CAGGAAGTTG CTCTTAATCC 360 

TGAAGAATTA GTGCTAGGAG ATGTCATTCG TTTGTCTGCA GGAGAGCAGA TTCCTAGTGA 420 

TGCCTTGGTT TTGGAAGGCT TTGCGGAAGT CAATGAAGCC ATGTTAACGG GAGAAAGTGA 480 

TTTGGTGCAA AAGGAAGTTG ACGGCTTACT TTTGTCAGGA AGTTTCCTAG CCAGTGGGTC 540 

AGTTTTATCT CAAGTTCACC ATGTCGGTGC AGACAACTAT GCTGCCAAAC TCATGCTTGA 600

GGCTAAGACC GTTAAACCCA TCAACTCCCG TATCATGAAA TCGCTGGACA AGTTGGCTGG 660 

TTTTACTGGG AAGATTATCA TTCCCTTTGG TCTGGCTCTC TTGCTGGAAG CCTTGCTTTT 720 

AAAAGGCCTG CCTCTCAAGT CATCCGTTGT AAACTCGTCG ACAGCTCTTT TGGGAATGTT 780 

GCCTAAGGGA ATTGCCCTTT TGACCATTAC TTCGCTCTTG ACTGCAGTGA TTAAGTTGGG 840 

CTTGAAAAAG GTCTTGGTGC AGGAGATGTA CTCTGTTGAG ACCTTGGCGC GCGTGGATAT 900

GCTCTGTCTG GACAAGACGG GCACCATCAC CCAAGGAAAG ATGCAGGTGG AGGCTGTTCT 960 

TCCACTGACG GAAACTTACG GTGAAGAGGC TATTGCCAGC ATCTTGACTA GCTACATGGC 1020 

CCATAGTGAG GATAAGAATC CAACTGCCCA AGCCATTCGC CAGCGTTTGT GGGAGATGTT 1080 

GCTTATCCT 1089 
(2) INFORMATION FOR SEQ ID NO: 64:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 731 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:
GCTAGCAATA TCATGTTTAT GCTTGATTTG GGGAATCATT TAGATCAGTG GTCCTTGAAA 60 
AAAACTGCAA CAGATTTAGA ACAGAGTCTT CTTGCAAAAG AGAGCGATGT ATTCCTAGTA 120 
CAGGGCGATA CGGTTGTTAG TATCAAGAGT TCCGATGTTC AAATAGGAGA TGTCTTGATC 180 
TTATCTCAAG GAAATGAAAT TCTGTTTGAT GGACAAGTAG TTTCAGGTTT AGGTATGGTC 240 
AACGAAAGTT CCTTGACAGG AGAGAGTTTT CCAGTTGAAA AAAGAGAGTC TGATTTGGTT 300 
TGTGCAAATA CAGTATTAGA AACTGGAGAG TTACGCATTC GTGTAACAGA TAATCAGATG 360 
AACAGCCGTA TTTTACAGCT GATTGAGTTG ATGAAGAAAT CTGAAGAAAA CAAGAAAACG 420
AAACAACGCT ATTTCATCAA GATGGCGGAT AAAGTCGTCA AATATAATTT CTTGGGGTCT 480 
GGGCTGACTT ACCTATTGAC AGGTTCTTTT TCTAAGGCTA TTTCTTTCCT ATTGGTCGAT 540 

TTCTCCTGCG CTTTGAAAAT CTCTACTCCT GTAGCTTATT TGACAGTTAT CAAGGTAGGG 600 
TTGAACCGTG AAATGGTGAT TAAGGATGGA GATGTTCTGG AGAAATATCT GGTAGTTGAT 660 
ACTTTCTTGT TTGATAAGAC AGGACCAATC ACAACTAGTT ATCCTATAGT TGAAAAGGTG 720
TACCCTTTGG G 731 
(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2197 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65: 
TATATTATTC CATTTGTGGT AAATCTGTAC ATGATAGATT AAGTACTCCG ACTGAAACCA 60 
GTACACTAAT CAAGCTATAG CCAGCTAACA AAAGGAGTAA CCATAGAATA TTAACTTTTA 120
AATTTTCCTT CATCGTTTAC ACCTTCTCTT TCACATTCTT ACCAAGGATA CCAGCTGGGC 180 
GGACAATCAA GATCAACAAC AAGATTCCAT AAACAATGGC ATCACGGAAA TCTGACATCC 240 

CAAAGGCTGT CGCAAAGGTT TCCAATAGAC CAATCACAAA GCCACCAAGA GCCGCACCAG 300 
GAATAATTCC GATACCACCA AGTACTGCGG CAACGAAAGA TTTAAGACCT GGAGTAACCC 360 
CCATCAAAGG CTCAAGAGAG TTATAATAAA GAGCAATCAG AACACCAGCC GCACCCGCAA 420
GAGCAGAACC CAAAGCGAAG GTAAAGCTAA TCGTACGGTT TATATTGATC CCCATCAATT 480
GCGCCGCGTC GCTATCTACT GATACTGCAC GCATGGCTTT CCCCATCTTA GTCTTTTGGA 540 
CAATGACTTG TAACAAAATC ATCAAAATCA AGGAAATGCC CAAAATCATT AACTGCACAT 600 
TTGTTAAGCT AATTGGTCCC AAATCATATC GAACTGTTTG AATCGCTTGA GGGAAGGCAC 660 
GGGTATTGGC ACCAACCAGA TAGACCATTC CATACTCCAA TAGGAAAGAA ACCCCAATAG 720 
CCGTAATCAA AACAGCAATA CGAGTAGAGT GGCGCAAAGG TCGGTAAGCA AGAAACTCAA 780 
TCACGACACC AAGAATAGCT GTCGCTAGCA TAGCTACAAT AAGCGCTACA AAGAAATTCA 840 
TTTGGAAAGA ATTGATCAAG AAATAACCGA TAAAGGCTCC CATCATATAA ATATCACCAT 900
GGGCGAAGTT GATGAGCTTG ATAATTCCGT AAACCATGGT ATATCCTAGG GCTAACAGCG 960 

CGTAAACACT ACCTAGAATC AAACCATTTA CGAGTTGTTG GAGCATAAGA TTCACTCTTT 1020 

CTATTTATAA TTCCGAGGGT TTTCCCTCAC TTTTTGATAG GTTCTTATAC TCAATGAAAA 1080 

TCAAAGAGCA AACTAGGAAA CTAGCCGCAG GTTGCTCAAA GCACTGCTTT GAGGTTGTAG 1140 

ATAAGACTGA CGAAGTCAGT CACATATATA ATCCAAGGCG ACGTTGACGC AGTTTGAAGA 1200

GATTTTCGAA GAGTATTAAA TATCGAAACA GGGAGTGAGT CAAAGGCTCA TTCCCTATTT 1260 

CAACATTTTT CTATTATGGT TTTACAACTT CTGCTGCTTC AACTTTACCA TTGTTCATGG 1320 

TCATCATGTA AGCAGTTTTG ACTGTGTTGT GGTCTGCATC GAAGCTTGTT TGACCAGTTA 1380 

CACCTTCAAA ATCTTTTGTT TTAGCAAGGT TATTCTTGAT TTCACCTGAA TTTTTAGCAC 1440 

CTTTTGCTGC GTTTGCTACA AGGTGAACTG AATCATAAGC CAAGGCTGCA AATGTTGAAG 1500

GCTCTTCATT GTACTTAGCA CGGTAAGCGT CAAGGAAGGC TTTAGCTTTA GCTGAAACTT 1560 

CTACAGTAGT TGAGAAGCCT GAGATAAAGT AGATGTTTGA TGCTTTTTCA GCAGTTGCTT 1620 

GTTGTACAAA CTCCTCACCG TTGAATCCAT CACCACCAAC GATTGGTTTG TCAATTCCCA 1680 

TACCACGCGC TTGGTTTACA ATCTTACCAG CCTCATTATA GTAACCAGGA ACAACGATAG 1740 

CATCAAAGTC TTTCCCTTTC ATTTTTGTAA GGGCTGCTTG GAAGTCTGTG TCACCTGCTA 1800

CGAAAGTTTC ATCTGCAACG ATTTCACCCT TGTATGACTC GCGGAAAGAT TTGGCAATCC 1860 

CTTTAGCATA GTCACTGGCA TTGTCAGTGT AAAGAACAAC TTTCTTAGCA TTTAATTTTT 1920 

CAGAAACATA GTTTGAGATA ATTTTTCCTT GGAAGCTATC TTGGAAAGTT CCAATAAAGA 1980 

GGTAATCTTG ACCTTTAGTC AATCCATCTT GAGTCGCACT TGGTGAGATC AATGGAACAC 2040 

CTGCTTTTGT AGCGTTCGCT ACCGCAGCTG CAGTCGCACC AGATGTCGCA GGTCCTACGA 2100

CTGCTGATAC TTTAGATTGG GTTACAAGGT TAGTTGTAAC TGAAGCAGCC TCAGCTGTTT 2160 

CAGACTTATT ATCTTTATCG ACTACTTCGA TTTGTTT 2197 
(2) INFORMATION FOR SEQ ID NO: 66: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 900 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: 



TGTCCCAAGA CCAGACTTGG TATGCTCTGG CCTATGATGG GGCAGAAGTG ATTGGCTTTC 60

TAACTGTTCA GGAGACTCTC TTTGAAGCAG AAGTCCTGCA AATCGCTGTC AAAGGAGCTT 120

ATCAGGGTCA GGGAATTGCG TCAGCCTTGT TTGCTCAATT GCCGACAGAC AAGGAAATTT 180

TCCTCGAAGT CAGACAGTCA AATCAACGAG CGCAAGCATT TTACAAGAAA GAAAAGATGG 240

CAGTTATCGC TGAGCGAAAG GCCTACTACC ATGACCCAGT CGAGGACGCC ATTATCATGA 300

AGAGAGAAAT AGATGAAGGA TAGATATATT TTAGCATTTG AGACATCCTG TGATGAGACC 360

AGTGTCGCCG TCTTGAAAAA CGACGATGAG CTCTTGTCCA ATGTCATTGC TAGTCAAATT 420

GAGAGTCACA AACGTTTTGG TGGCGTAGTG CCCGAAGTAG CCAGTCGTCA CCATGTCGAG 480

GTCATTACAG CCTGTATCGA GGAGGCATTG GCAGAAGCAG GGATTACCGA AGAGGACGTG 540

ACAGCTGTTG CGGTTACCTA CGGACCAGGC TTGGTCGGAG CCTTGCTAGT TGGTTTGTCA 600

GCTGCCAAGG CCTTTGCTTG GGCTCACGGA CTTCCACTGA TTCCTGTTAA TCACATGGCT 660

GGGCACCTCA TGGCAGCTCA GAGTGTGGAG CCTTTTGGAG TTTCCCTTGC TAGCCCTTTT 720

AGTTCAGTGG GTGGGGCACA CAGAGTTGGT CTATGTTTCT GAGGCTGGCG ATTACAAGAA 780

TTGTTGGGGA AGACACGAGA CGATGCAGTT GGGGAGGCTT ATGACAAGGT CGGTCGTGTC 840

ATGGCTTGAC CTATCCTGCA GGTCGTGAGA TTGACGAGCT GGCTCATCAG GGGCAGGATA 900



(2) INFORMATION FOR SEQ ID NO: 67:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1023 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67: 
CCGGCGATCT TCCGCTAGAA ATAGTCTACC AAGATGAGGA TGTGGCTGTC GTTAACAAAC 60 
CTCAGGGAAA TGGTTGTGCA CCCGAGTGCT GGTCATACCA GTGGAACCCT AGTAAATGCC 120
CTCATGTATC ATATTAAGGA CTTGTCGGGT ATCAATGGGG TTCTGCGTCC AGGGATTGTT 180 
CACCGTATTG ATAAGGATAC GTCAGGTCTT CTCATGATTG CTAAAAACGA TGATGCGCAT 240 

CTAGTACTTG CCCAAGAACT CAAAGATAAA AAGTCTCTCC GCAAATATTG GGCGATTGTT 300 
CATGGAAATC TGCCTAATGA TCGTGGTGTA ATTGAAGCGC CGATTGGCCG GAGTGAAAAA 360 
GACCGTAAGA AACAGGCTGT AACTGCTAAA GGGAAGCCTG CAGTGACGCG TTTTCACGTC 420
TTGGAACGCT TTGGCGATTA TAGCTTAGTA GAGTTGCAAC TGGAGACAGG GCGCACTCAT 480 
CAAATCCGTG TCCACATGGC TTATATCGGC CATCCAGTCG CTGGTGATGA GGTCTATGGT 540 

CCTGCAAGAC TTTGAAAGGA CATGGACAAT TTCTTCATGC CAAGACTTTA GGTTTTACTC 600 
ATCCGAGAAC AGGTAAGACC TTGGAATTTA AAGCAGATAT CCCAGAGATT TTTAAGGAAA 660 
CCTTGGAGAG ATTGAGAAAG TAAGAATGAA AAAGAAATTA ACTAGTTTAG CACTTGTAGG 720
CGCTTTTTTA GGTTTGTCAT GGTATGGGAA TGTTCAGGCT CAAGAAAGTT CCAGGAAATA 780 
AAATCCACTT TATCAATGTT CAAGAAGGTG GCAGTGATGC GATTATTCTT GAAAGCAATG 840 

GACATTTTGC CATGGTGGAT ACAGGAGAAG ATTATGATTT CCCAGATGGA AGTGATTCTC 900 
GTTATCCATG GAGAGAAGGA ATTGAAACGT CTTATAAGCA TGTTCTAACA GACCGTGTCT 960 
TTCGTCGTTT GAAGGAATTG AGTGTCCAAA AACTTGATTT TATTTTGGTG ACCCATACCC 1020
ACA 1023
(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 675 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:

GCTCGGTACC CGGGGATCCT CTAGAGTCGA TAATATCAAC CTGCAGGTTG ATGAACGAGA 60

TCGGATTGCT CTTGTTGGGA AAAATGGTGC AGGTAAGTCT ACTCTTTTGA AGATTTTAGT 120 

TGGAGAAGAG GAGCCAACTA GCGGAGAAAT CAATAAGAAA AAAGATATTT CTCTGTCTTA 180 

CCTAGCCCAA GATAGCCGTT TTGAGTCTGA AAATACCATC TACGATGAAA TGCTTCATGT 240 

CTTTAATGAT TTGCGTCGGA CGGAGAGACA ACTGCGTCAG ATGGAGCTGG AGATGGGTGA 300

AAAGTCTGGT GAGGATTTGG ATAAACTGAT GTCAGATTAT GACCGCTTAT CTGAGAATTT 360 

TCGCCAAGCA GGTGGCTTTA CCTATGAAGC TGATATTCGA GCGATTTTGA ATGGATTCAA 420 

GTTTGACGAG TCTATGTGGC AGATGAAAAT TGCTGAGCTT TCTGGTGGTC AAAATACTCG 480 

TTTGGCACTT GCCAAAATGC TCCTTGAAAA GCCCAATCTC TTGGTCTTGG ACGAGCCAAC 540 

TAACCACTTG GATATTGAAA CCATCGCCTG GCTAGAGAAT TACTTGGTAA ACTATAGCGG 600

TGCCCTCATT ATCGTCAGCC ACGACCGTTA TTTCTTGGAC AAGGTTGCGA CAATTACGCT 660 

AGATTTGACC AGCAT 675 
(2) INFORMATION FOR SEQ ID NO: 69: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 582 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69: 

TAGAGTCGAT AGCAATAGAT TGAGTAAGTT GTAGCCTTAC CAGCATCGAT AGAAACAAGG 60

GCACCACGGT GACGTCCACC AATTTCCCCT GGAATCAATG GCAAGTATTG GTCGAAGGTA 120 

TGGTTCATGA TACCGTAACC ACGAGTCATT GACAAGAACT CAGTTGAGTA TCCAATCAAA 180 

CCACGCGCTG GAACAAGGAA GACCAAACGA GTTTGACCAT TACCAGTTGA AATCATATCC 240 

AACATTTCAC CTTTACGTTC AGAAAGGCTT TGGATAACAG ACCCTTGGTA TTCTTCTGGA 300 

GTGTCGATTT GTACACGTTC AAATGGTTCA CATTTAACAC CGTCGATTTC TTTTACGATA 360

ACTTCTGGAC GAGATACTTG AAGTTCATAG CCCTCACGAC GCATTGTTTC GATAAGGATT 420 

GACAAGTGCA ATTCTCCACG TCCTGAAACA GTCCATTTAT CTGCGTGAAT CAGTTGGGTC 480 

AACACGAAGG AACGTCTGTT TGCAATTCTG CCTGCAAGCG TTCTTCCACC TTACGAGAAG 540 



TTACCCATTT ACCTTCTTTA CCAGCAAATG GTGAGTTGTT GA 582
(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1337 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70: 

TTGGATTGAA GAACAAAGAT TTGGACTCTA TTGACCTTAT GGTTTGGGGG AAATTTGGAA 60 

TTTCAAAGTC GCCCAACCCC CTCATTTCTA AAGAATTGGA AGCCGGATGG GACTCTACCA 120 

AACACGTTTA ACCCAAAGAA AATTGGCAGG AAGAAATGGA AAAAATTTGA TTTTTAAAAA 180 

ATACTTAAGG AAACTTTAAG CTAGGGAGTG TACCCTAAGT TCAATAAAGT TAAAGAAGAC 240 

CTTAACTTAA ACTCCTAAAA CTTTTTCAAT AATAATCTCC CTATAAAAAT AAAGTCGCCC 300 

AATCAGGCGG CTTAATTTTT TTGAAAAATG GGCTTGGTGC CTGAGAATAA ATAGCTTAGT 360 

GATAGAAGAA AATGGGGAAA TATGGTATAA TGAAACGATA GATTTTTGAA TAGGAATAAG 420 

ATCATGTTTG GATTTTTTAA GAAAGATAAG GCTGTGGAAG TAGAGGTTCC GACACAGGTT 480 

CCTGCTCATA TCGGCATCAT CATGGATGGC AATGGCCGTT GGGCTAAAAA ACGTATGCAA 540 

CCGCGAGTTT TTGGACACAA GGCGGGCATG GAAGCATTGC AAACCGTGAC CAAGGCAGCC 600 

AACAAACTGG GCGTCAAGGT TATTACGGTC TATGCTTTTT CTACGGAAAA CTGGACCCGT 660 

CCAGATCAGG AAGTCAAGTT TTCATGAACT TGCCAGTAGA GTTTTATGAT AATTATGTCC 720 

CGGAACTACA TGCGAATAAT GTTAAGATTC AAATGATTGG GGAGACAGAC CGCCTGCCTA 780 

AGCAAACCTT TGAAGCTTTA ACCAAGGCTG AGGAATTGAC TAAGAACAAC ACAGGATTGA 840 

TTCTTAATTT TGCTCTTAAC TATGGTGGAC GTGCTGAGAT TACACAGGCG CTTAAGTTGA 900 

TTTCCCAGGA TGTTTTAGAT GCCAAAATCA ACCCAGGTGA CATCACAGAG GAATTGATTG 960 

GTAACTATCT CTTTACTCAG CATTTGCCTA AGGACTTACG AGACCCAGAC TTGATTATCC 1020 

GTACTAGTGG AGAATTACGT TTGAGCAATT TCCTTCCATG GCAGGGAGCC TATAGTGAGC 1080 

TTTATTTTAC GGACACCTTA TGGCCTGATT TTGACGAAGC GGCCTTGCAG GAAGCTATTC 1140 

TTGCCTATAA TCGTCGTCAT CGCCGATTTG GAGGAGTTTA GGAGGAAATA TGACCCAGGA 1200

TTTACAGAAA AGAACCTTGT TTGCAGGGAT TGCCCTGGCT ATTTTCCTAC CAATTTTAAT 1260
GATTGGGGGC TCTTGCTTCA GATAGCAATC GGAATCGTAG CCATGCTAGC CATGCATGAA 1320 

CTTTTGAAGA TAAGAGG 1337 
(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 818 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:

TCGGTACCCG GGGATCCTCT AGAGTCGATA GTCGCCAAGC AGAAGAAGGG AACACCATTC 60 

GTAGAAGACG TGAGTGCGAC GAATGCCAAC ACCGTTTTAC AACCTACGAA CGAGTAGAAG 120 

AAAGAACCTT AGTGGTTGTT AAAAAAGATG GCACACGGGA ACAATTCTCC AGAGATAAAA 180 

TCTTTAATGG GATTATCCGC TCAGCCCAGA AACGTCCTGT GTCAAGTGAT GAAATCAACA 240 

TGGTGATCCT CTAGAGTCGA ACAGAAACTC CGTGGTCGAA ATGAAAATGA AATTCAAAGT 300

GAGGACATTG GTTCACTCGT CATGGAGGAG TTGGCTGAAT TGGACGAGAT TACCTATGTA 360 

CGTTTTGCTA GTGTCTATCG TAGTTTTAAG GATGTCAGTG AGTTAGAGAG CTTGCTCCAA 420 

CAAATCACCC AGTCCTCTAA AAAGAAAAAG GAAAGATAAA TGAAGCCAAT TGACCGTTTT 480 

TCTTATCTAA AGAATAATCG GGTGTCGCAA GATACCTCAT CTCTGGTACA GTGCTACCTC 540 

CCGATTATCG GTCAGGAGGC ACTGAGCCTT TATCTTTATA CGATTAGTTT TTGGGATAAT 600

GGTAGAAAGG AATATCTTTT TTCAAGTATC CTCAATCATC TTAACTTTGG AATGGATAGA 660 

CTGATAAAAT CATTGAAAAT CTTATCTGCT TTTAATCTCT TGACTCTCTA TCAAAAGGGG 720 

GATGTTTATC AGCTAGCCCT CCATGCTCCT CTATCTAGTC AAGACTTCTT GGGGCATCCT 780 

GTTTATCGCA GACTCTTAGA GAAAAAGATT GGGGACAA 818 
(2) INFORMATION FOR SEQ ID NO: 72:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 746 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72: 
TTACCCGGGG ATCCTCTAGA GTCGATATGC TCTCTGAGGG TCAATTCCTC ATACAGACTA 60 
GGCGTCTCAG GAATGTAGCC AATCTGCTTG CGGTAGCTAG TCGCATCTCC TTGCAGAGTC 120
AGGCCATTGA TATTGATGGA GCCACTATAA GGTGCCAACA GACCGATAAT CTCATTGATC 180 
GTCGTTGATT TCCCAGCACC ATTGAGACCA ATCAAACCGA CCAACTGCCC ACTTTCAACA 240 

GTAAAGGACA CATCTTTCAA AACAGGAACA TGAACATAGC CACCTGTCAG GTTTTTAATT 300 
TCTAACATAT TTTCTCCAAA TCTGGTATAA TGTAGCTATA TTATATCAAA ATTCAGTACA 360 

GTAGAGGTAG ATTTTATGTC AGATTGCATT TTTTGTAAAA TCATCGCAGG GGAAATTCCT 420
GCTTCGAAAG TATATGAAGA TGAGCAGGTC CTTGCCTTTC TTGATATCTC TCAAGTAACA 480 
CTAGGACACA CCTTGGTCGT GCCAAAAGAA CACTATCGCA ATCTTTTGGA GATGGATGCT 540 
ACGAGCGCCA CCAACTCTTT GCCCAAGTAC CAAAAGTAGC TCAAAAAGTC ATGAAAGTCA 600 
CTAAGGCTGC TGGTATGAAT ATCATTTCCA ACTGTGAAGA AGTCGCTGGT CAAACAGTTT 660 

TTCATACTCA CGTTCACCTT GTGCCTCGCT ACAGTGCTGA CGATGACCTC AAGATTGATT 720
TTATCGCCCA CGAAACAGAC TTTGAC 746 
(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 767 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73: 
GATCCAAGCA GTCCGTGATG TAAGCTTTGA AGTTAATGAA GGAGAAGTTG TTTCCCTTAT 60 
CGGTGCCAAC GGTGCAGGTA AGACAACTAT TCTTCGCACC TTGTCAGGTT TGGTTCGACC 120

AAGTTCAGGA AAGATTGAAT TTTTAGGTCA AGAAATCCAA AAAATGCCAG CTCAGAAAAT 180

TGTGGCAGGT GGTCTTTCAC AAGTTCCAGA AGGACGCCAC GTCTTTCCTG GCTTGACTGT 240 

TATGGAAAAT CTTGAAATGG GAGCTTTCTT AAAGAAAAAT CGTGAAGAAA ATCAAGCTAA 300 

CTTGAAGAAG GTTTTCTCAC GCTTTCCTCG TCTTGAAGAA CGTAAGAACC AAGATGCAGC 360 

TACTCTTTCA GGAGGGGAAC AACAAATGCT TGCCATGGGA CGCGCTCTTA TGTCAACACC 420 

AAAACTTCTT CTTTTAGATG AACCATCAAT GGGACTTGCC CCAATCTTCA TCCAAGAGAT 480 

TTTTGATATC ATTCAAGATA TTCAGAAGCA AGGAACAACC GTCCTCTTGA TTGAACAAAA 540 

TGCCAATAAA GCACTTGCAA TCTCTGACCG AGGATATGTA CTGGAACAGG GAAATCGTCT 600

ATCAGGGACA GGGAAAGACT CGCTCATCAG AGGAGTCAGA GCATATCTAG GTGGTAAACA 660 

TCCAGTGGAT TTTTGTCGGC AGTGAGTTCG GGATCATCAT TTAGTTGGGG CTTGTTAGGT 720 

TCAGTAAGTC GGTTATCAAA TCAGGGTTGT TTGCCGCAGT GGGGTCG 767 
(2) INFORMATION FOR SEQ ID NO: 74: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 695 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74:

GAGCTCGGTA CCCGGGGATC CTCTAGAGTC GATAATTCGT TGGTTGGACG AACCTCGAAA 60 

CTGGAGCATG AGATTTCTCT TAGTTCGATC ATATCTTCCA TCGACAAGAA TGTCAATCAA 120 

TGATAAGAGT TCCAGTTTAT CTGGAGTTTC CGGGATCATT TCTTCCCAAG TGTAGCCCGT 180 

CCAGGACCAA ATGTCCTTGT CTGGCAATTC CTTTCGGATG CGTTTAACTA GAGGCAAGAG 240 

AATGCCAGTA TTGAGAAAAG GCTCCCCTCC CAGCAAAGTC AAGCCTTGAA CATAGGGTTG 300

GGCAAGGTCT GCCATAATCT GCTCTTCTAA TTCTGCTGTA TAGGGAATGC CAGCATTAAA 360 

AGACCAAGTC GCAACATTAT AACATCCCTC GCAGTGAAAC ATACAGCCTG ATACATAGAG 420 

AGAGTTGCGC ACGCCTTCGC CGTCCACAAA GTTAAAGGCC TTGTAGTCAA TGATACGACC 480 

TTGACTAAGT TCCTCGCTTT TCCATTCTTG TGGTTTTGGA TTATTCATTC GCTACCTCTA 540 

TCCAATAACG CTCGACTCCA TTGCGAGCAT CCTCAAATAT TCCACCATTT GCTAGAATGA 600

CTGCTCTGCT AGCAGGATTA TTCACGCTAC AGGGCACCAG AGCTTTCTTG ATGTCTTTTC 660
CCTAGCAACT TCAAGCCCTG ACGGAAGTCT TTTTT 695 
(2) INFORMATION FOR SEQ ID NO: 75:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 723 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75: 

CTCGGTACCC GGGGATCCTC TAGAGTCGAC GGCTACAATG ATATTAAGAT GGATGATGTG 60 

ATTGACGCGT ATGTCATGGA AGAAATCAAG AGATAAGATT TTTTGCTCCT TTCTTAGGTG 120 

GTGAGGGACG CAAGCAAACC GATGGTTTCA TTGCTTATTT TTGAGCCTAG GGTCTCAAAA 180 

ATCCCCTGTG ATGGGACTGA TAAATCAGTT CCATCACTTT CACCACGGCG AAAGAAGCAG 240

ATGACTTCAA ATTGAACTTC GTTTCAATTT AAACTGAAAA TCAAGAAGTT TAAAATAGCT 300 

AGGTCTGCTG GCCTAGCTTT TGGTTCAAAG TAGAGAAAGG AATATCATGG TAAATCATTT 360 

CCGTATAGAT CGTGTGGGCA TGGAAATCAA GCGTGAAGTC AATGAGATTT TGCAAAAGAA 420 

AGTCCGTGAT CCACGTGTCC AAGGTGTGAC CATCACAGAT GTTCAGATGC TGGGTGACTT 480 

GTCTGTTGCC AAGGTTTATT ACACCATTTT GAGTAACCTT GCTTCGGATA ACCAAAAAGC 540

CCAAATCGGG CTTGAAAAAG CAACTGGTAC CATCAAACGT GAACTTGGTC GCAATTTGAA 600 

ATTGTACAAA TCCCAGATTT GACCTTCGTC AAAGACGAGT CCATCGAGAT GGAACCAAGA 660 

TTGACGAGAT GCTACGAAAT CTGGATAAGA CTAAAGAAGA GGGGGTTGCC CCCCTTTTTT 720 

GGG 723 
(2) INFORMATION FOR SEQ ID NO: 76:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 970 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76: 

TGTCCTTATT TGTCTGACCA AGTGCAAGCT GGTCGGATTT GTGGTAACAT TGGATAAGAT 60 

TTGACAAAGG AATTTCCATC ATGTAACGGT CTTACTCCAC GAAACGATTG ATATGCTTGA 120

CGTAAAGCCT GAAGGTATCT ACGTTGATGC GACTTTGGGC GGAGCAGGAC ATAGCGAGTA 180 

TTTATTAAGT AAATTAAGTG AAAAAGGCCA TCTCTATGCC TTTGACCAGG ATCAGAATGC 240 

CATTGACAAT GCGCAAAAAC GCTTGGCACC TTACATTGAG AAGGGAATGG TGACCTTTAT 300 

CAAGGATAAC TTCCGTCATT TACAGGCACG TTTGCGCGAA GCTGGTGTTC AGGAAATTGA 360 

TGGAATTTGT TATGACTTGG GAGTGTCTAG TCCTCAATTG GACCAGCGTG AGCGTGGTTT 420

TTCTTATAAA AAGGATGCGC CACTGGACAT GCGGATGAAT CAGGATGCTA GTCTGACAGC 480 

CTATGAAGTG GTTAATCATT ATGACTATCA TGATTTGGTT CGTATTTTCT TCAAATACGG 540 

TGAGGATAAA TTCTCTAAAC AGATTGCGCG TAAGATTGAG CAAGCGCGTG AAGTGAAGCC 600 

GATTGAGACA ACGACTGAGT TAGCAGAGAT TATCAAGTTG GTCAAACCTG CCAAGGAACT 660 

CAAGAAGAAG GGTCATCCTG CTAAGCAGAT TTTCCAGGCT ATTCGAATTG AAGTCAATGA 720

TGAACTGGGA GCGGCAGATG AGTCCATCCA GCAGGCTATG GATATGTTGG CTCTGGATGG 780 

TAGAATTTCA GTGATTACCT TTCATTCCTT AGAAGACCGC TTGACCAAGC AATTGTTCAA 840 

GGAGCTTCAA CAGTTGAAGT TCCAAAAGGC TTGCTTTCAT CCCAGATGAT CTCAAGCCCA 900 

AGATGGAATT GGTGTCCCGT AAGCCAATCT TGCCAAGTGC GGAAGAGTTA GAAGCCAATA 960 

ACCGTTGACT 970

(2) INFORMATION FOR SEQ ID NO:77: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 954 base pairs

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:77: 
GAAAGGAGTA ACTGATGCAC GTAACAGTAG GTGAATTAAT TGGTAATTTT ATTTTAATCA 60

CTGGCTCTTT TATTCTTTTG CTAGTCTTGA TTAAAAAATT TGCATGGTCT AATATTACAG 120

GCATTTTCGA AGAAAGAGCT GAAAAAATTG CTTCAGATAT TGACAGAGCT GAAGAAGCCC 180 

GTCAAAAAGC AGAAGTATTG GCTCAAAAAC GCGAAGATGA ATTGGCTGGT AGCCGTAAAG 240 

AAGCTAAGAC AATCATTGAA AATGCAAAGG AAACAGCTGA GCAAAGTAAG GCTAATATCT 300 

TAGCAGATGC TAAACTAGAA GCGGGACACT TAAAAGAAAA AGCCAATCAA GAAATTGCTC 360

AAAATAAAGT AGAAGCTTTA CAGAGTGTTA AGGGTGAGGT CGCAGATTTG ACCATCAGCT 420 

TAGCTGGTAA AATCATCTCA CAAAACCTTG ACAGTCATGC CCATAAAGCA CTCATTGATC 480 

AGTATATCGA TCAGCTAGGA GAAGCTTAAT GGACAAGAAA ACAGTAAAGG TAATTGAAAA 540 

ATACAGCATG CCTTTTGTCC AATTGGTACT TGAAAAAGGA GAAGAAGACC GTATCTTTTC 600 

AGACTTGACT CAAATCAAGC AAGTTGTTGA AAAAACAGGT CTGCCTTCTT TTTTAAAACA 660

AGTGGCAGTA GACGAGTCGG ATAAGGAAAA AACAATTGCT TTTTTCCAAG ATTCTGTGTC 720 

ACCTTTATTA CAAAACTTTA TCCAGGTTCT GGCCTACAAT CACAGAGCAA ATCTTTTTTA 780 

25 

TGATGTGCTT GTAGATTGCT TGAACCGACT TGAAAAAGAA ACAAATCGAT TTGAAGTGAC 840 

GATTACGTCT GCTCATCCTC TAACTGATGA ACAGAAGACT CGTTTGCTCC CTTTGATTGA 900 

30 GAAAAAAATG TCTCTGAAAG TAAGGAGTGT AAAAGAACAA ATCGATGAAA GTCT 954 

(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS: 
35 (A) LENGTH: 1602 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

40 (ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:78: 

CCTGATTATA CCCAACCTCT TTGCATCAAG TCGGAAAAAT GAGTGAAATG GGTTTCCAGT 60 
TTTCCTGAAA TAAGGTATCC TATATAAAGT ACCCTATGAT AACCATGGAG GTATTGTGTA 120

55 TGGTTCAAAC AAGTCATTGA AGAAATACAA AACAATGCCA ACATTGTGGA AGTCATAGGA 180 

GATGTGATAT CTTACAAAAG GCAGGACGGA ACTATCTAGG GCTCTGTCCT TTTCATGGTG 240 

AAAAAACACC ATCTTTCAGC GTTGTAGAGA ACAAGCAGTT TTACCACTGT TTTGGTTGTG 300 

GTCGCTCAGG TGATGTCTTT AAAATTCATC GAGGAGTACC AAGGGGTTAC CTTTATGGAG 360 



GCTGTCCAAA TCTTAGGTCA GCGTGTCGGG ATTGAGGTTG AAAAACCGCT TTATAGTGAA 420

CAGAAGCCAG CCTCGCCTCA CCAAGCTCTT TATGATATGC ACGAAGATGC GGCTAAATTT 480

TACCATGCTA TTCTCATGAC AACGACTATG GGCGAAGAGG CCAGAAATTA CCTTTATCAG 540

CGGGGTTTGA CAGATGAAGT GCTTAAACAT TTTTGGATTG GTTTAGCACC TCCAGAACGA 600

AACTATCTCT ATCAACGTTT GTCTGATCAG TATCGTGAAG AGGATTTACT GGATTCAGGC 660

CTGTTTTATC TTTCGGATGC CAATCAATTT GTAGACACCT TTCACAATCG CATTATGTTT 720

CCCCTGACAA ATGACCAAGG AAAGGTCATT GCCTTCTCAG GTCGTATCTG GCAAAAAACG 780

GATTCACAAA CTTCTAAGTA TAAAAACAGC CGTTCGACTG TAATTTTTAA CAAAAGTTAC 840

GAATTATATC ATATGGATAG GGCAAAAAGA TCTTCTGGAA AAGCTAGTGA GATTTACCTG 900

ATGGAAGGAT TCATGGATGT TATTGCAGCC TATCGGGCTG GAATCGAAAA TGCTGTGGCG 960

TCGATGGGAA CGGCCTTGAG TCGAGAGCAT GTTGAGCATC TGAAAAGGTT AACCAAGAAA 1020

TTGGTTCTTG TTTACGATGG AGATAAGGCT GGGCAAGCCG CGACATTGAA AGCATTGGAT 1080

GAAATTGGTG ATATGCCTGT GCAAATCGTC AGCATGCCTG ATAACTTGGA TCCTGATGAA 1140

TATCTACAAA AAAATGGTCC AGAAGACTTG GCCTATCTAT TAACGAAAAC TCGTATTAGT 1200

CCGATTGAGT TCTACATTCA TCAGTACAAA CCTGAAAACG GTGAAAATCT GCAGGCTCAG 1260

ATTGAGTTTC TTGAAAAAAT AGCTCCCTTG ATTGTTCAAG AAAAGTCCAT CGCTGCTCAA 1320

AACAGCTATA TTCATATTTT AGCTGACAGT CTGGCGTCCT TTGATTATAC CCAGATTGAG 1380

CAGATTGTTA ATGAGAGTCG TCAGGTGCAA AGGCAGAATC GCATGGAAAG AATTTCCAGA 1440

CCGACGCCAA TCACCATGCC TGTCACCAAG CAGTTATCGG CTATTATGAG GGCAGAAGCC 1500

CATCTACTCT ATCGGATGAT GGAATCCCCT CTCGTTTTGA ACGATTACCG TTTGCGAGAA 1560

GACTTTGCAT TTGCTACACC TGAATTTCAG GTCTTACATG AC 1602




(2) INFORMATION FOR SEQ ID NO: 79: 











(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7203 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO 

60 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:

CCTCCATCAA ATCTGAGACT GATTCAAAAG ACTGGCTCAT ATTACGATTT TGGTCTAAAT 60

GCGTTAACAC TTGGAGCAAC TTCCGATTTT CGTCTAGTCT AACATCAAAA GGTAATCCCT 120

GATATTGAAT TGCCTGACGA AAGGAAAATA TTAATAGCTG TTGTCATATC CATTCCCAGA 180

TTACTAAACA CCTGTTGGGC CTGCTCCTTA ACCTCACTAT CCAGACGGAT GCTCATACTC 240

ATCTTTGACA TACTCTCACC CTCTTTCCAT AGACTATTTT AACAAAAAAG AAAGCTAATG 300

TAAATCTATT GGATATACGT TAGCCTCTTC TAATAGATTA TTAAGCAATT TTTTAAAACA 360

ACTCATCAAA CAAACTCAAC TGGTTATCCT CTGGCATATT TCCAAGAATA CCCATCTCAT 420

CCATCTTTTC AACCAAGGTT GATGAGAGTC CACCACGCTT GCGTAGTTCT GTTTTAGAGA 480

GGAATTCTCC CTCTTCACGC GCCCGCACCA GTTGCTTGGC AACGTTCTCT CCCAGACCAT 540

CCATTGCTAC AAATGGTGGG ATAAGGGTAT CCCCGTCGAT GAGGAACTCT GTCGCCTGAC 600

TACGGTAGAG ATCTAATTTG CCAAACTTGA AACCTCGTTC CCACATCTCA TTGACAATCT 660

CAAGAGTTGT ATAGAGATCG ATTTCCACAT TAGAGGCTTC ATTGTTCTTC CGTTTTTCAG 720

AGATTTCTTC CATTCTGCGC TTGATGGCCT CCAAGCCCGC ACCCATGGTC TTGATATCAA 780

AAGCCTTAGC ACGAATGGAG AAGTAAGCAC AGTAGTAATA AATAGGATGG TGAACCTTGA 840

AGTAAGCTAC ACGCAAGGCC ATCATAACGT AGGCTGCCGC ATGGGCCTTA GGGAACATGT 900

ACTTAATTTT CCCACAGGAT TCGATATACC ACTCTGGCAC CTTATTAGCC TTCATGGCTT 960

CGATATAGCC ATTTCTCTCC TCTTCTGAAA TCTTTAGCCA CAAACCCTTA CGTACCCGTT 1020

CCATAATGGT AAAGGCCATC TTAGGTTCCA GACCCGCATG CATGAGGTAA ACCATGATGT 1080

CGTCCCGACA ACCGATAACA GTCGATAGGT CCGCTATTCC TTGCTTAATC AGATCCTGAG 1140

CATTCCCCAA CCAAACGTCA GTACCGTGGG ACAGACCAGA CAGCTGAAGC AATTCCGCAA 1200

AGGTTGTCGG ATGGGTTTCG TCTACCATTC CACGTACGAA ATTTGTTCCA AACTCTGGAA 1260

TCCCTAACAT ACCCGTAGCG TTCCAATTTG TTCAGGTGTT ACCCCTAGCA CATCAGTCCC 1320

AGAAAAGAGT GCCATCACGC CTTCGTCATC CATAGGAATT TTATTAGGGT CAATACCAGA 1380

CAAATCCTGA AGTTTTCGAA TCATGGTCGG ATCATCATGT CCCAGTACAT CGAGTTTGAG 1440

GACGTTCTCA TCGATATCGT GGAAGTTAAA GTGAGTGGTC TGCCATTCAG CCGTGACATC 1500

ATCTGCTGGA TACTGGACAG GCGTAAAATC GTAGACATCC ATGTAGTTCG GAATAACAAC 1560

GATTCCCCCC GGGTGTTGGC CTGTTGTCCG CTTGACACCC GCCGCTCCTT GAGCGAGGCG 1620

TTCTACTTCT GCATCACGAT AAAACTTGCC ATAATCTCGC TCGTAACCCT TGACAAATCC 1680

ATAGGCAGTC TTGGCAGCTA CCGTACCAAC TGTTCCCGCA CGGAAGGCAT ATTCTTCACC 1740

AAAGATATCA CGCACATCCA AGTGGGCGCT AGGCTGATCT TCTCCCGAGA AGTTCAAGTC 1800

AATATCAGGA ACCTTATCCC CATCAAAACC AAGGAAGGTC TCAAACGGAA TATCCTGTCC 1860

GTTTTTACTG AGTTTGTGAC CACAGTTTGG ACAGTCCTTA TGGGGCATAT CAAATCCTGA 1920 

5 ACCGTACGAA CCATCTGTGA TAAACTCACT GTACTGACAC TGACCACAGA CATAGTGAGG 1980 

AGAGAGAGGA TTGACCTCCG TAATCCCAAT CATGGTCGCA ACGAAACTAG ATCCGACAGA 2040

CCCACGAGAA CCAACCAAAT AACCCCGTTC ATTAGAACGT TGCACCAGCA TCTGCGATGC 2100 

10 

CAGATAAATC ACAGCAAATC CATTCCCCAG TATGGATGTT AATTCTTTTT CAATCCGCAA 2160 

ATCAACAATA TCTGGCAGCG GATTTCCATA AATCTCAAAA GCTTTCTTAT AGGTCAACTC 2220 

15 AGCAACTGTT TCTTCAGCCT TGTCGATGAA AGGCGTATAC AAGTCACCCT TAACGACTTC 2280 

AACGGGTTCA AATATTTCTG CCAAGGCATT GGTGTTTTCA ATAACCAGTT TACGAGCCAG 2340 

TTCCTCTCCC AAAAAGGCAA ATTCATCCAA CATCTCATTA GTCGTTCGAA AATGAGCCTT 2400 

20 

TGGAAGTGGT GCTGGTTGGG CATGTTCACC ATGACCGATA GTTCGGTTAA TCATCGCACC 2460

CTGTCCCAAA CTACGGACGA TAATTTCACG ATAAATCTCT TCTTCCGGTT CGATATAGTG 2520 

25 AACATTTCCC GTAGCCAAAA CAGGCTTGCC AAGGCGGTCT CCAACCTCTA TCAAACTCTT 2580 

GATAATGGTC TGGAGTTCCT CCATATCCTT GACCTGCTCT TTAGCAATCA AGGGCGCATA 2640 

GATAGCCGGT GGCATGACCT CGATAAAGTC ATAATACTTG GCCACCTCAA CCGCCGCATC 2700 

30 

CACACCTTGA GAAACGACCA CGTCAAAAAC TTCACCCTCT GAACAGGCTG AACCTAAAAT 2760 

CAAGCCCTCT CGATGGGCAT CTAGAACCGT TCTCGGAATC CGTGACACTC CTTCAAAATA 2820 

35 CTTGGTATTA GACAAGGAAA CCAGCTTAAA GATATTTTTT AGACCTACCT GATTCTTGAC 2880 

ATAGATGGTC GCATGCTTGA TCCGAGCTTT TTTGTAAGAA TCTGGACTGA TTAGATCAAT 2940 

GTTGAGTCTA GCTAAATCGG TCACACCATG TTTTTCTGCT ACCTCTTTGA TAAAGATAAA 3000 

40 

GCAGACGACC AGTCGCTTCC GCATCGTATT GGCCATGTGG TGATGTTCCA AGCCACACCA 3060 

AAACGCTTGG TCAAAGGCCC AAACCATGAT TTATACTCAG GATAGAGGTT TCAGCAAACT 3120 

45 CCAGGTATCA ATAACGGCTG ACTAATCTTT GGCCATGACG CTCATAATTA GCATTCATAA 3180 

AGCCAACGTC AAAGGTAGCA TTGTGGGCAA CTAGGACCGT ATCCTTGCAA ATTCTTGGAA 3240 

TTCTTGCAAA ACTTGTTCTA GTGGTTTGGC ATTTTTGACA TGATCATCTG TAATTCCAGT 3300

50 

TAACTCTGTA GTAAAAGCTG ACAAGGGATG CCCAGGATTG ATAAATTCAT CAAATTCAGC 3360 

AATAACATTC CCCTTGTACA TCTTAGAGGC CGCAACCTGA ATCAAGTCAT TATAGATAGC 3420 

55 TGAAAGTCCC GTCGTTTCCA CGTCAAAGAC CACGTAGGTT GCTTCTGATA AGTCCATCTA 3480 

CATTCGTTAT AGACGATAGG GACACGGTCC TCCACGATAT TGGCTTCTAT CCCATAGATC 3540 

AGCTGGATTC CCGCTTTCTT AGCCGCCTTA TAGCCATGTG GAAAGGACTG GACATTCCCA 3600 

60 

TGGTCTGTGA TAGCAACCGC CTTGTGTCCC CACTTAGCAG CTGTTGCAAC AATCTCTTCG 3660

ACCTCTGGCA AAGCATCCAT AGTCGACATG TTAGTATGAG CATGAAACTC AACCCGACGC 3720

TCACCTTCTG GCATCAAATC CTTCCGCTCA TAGTGAACAA CTTCCTGCAG ATCCTGTACG 3780

TTCATAGTCA AATCGCGTGT GAAGTTATTC ATCTCCACAT TCCCTCGAAC TCGGAGCCAA 3840

GAATTCTTCT TGATGAGGTC AAACTTCTGG GCCTCTTCCT CGTTTTTAAC CCACTTTTGC 3900

ATAGAAAAAC TTGAAGTATA GTCCGTCATT TTAAAGTTGA TTAAAACACG ACCTGTTCTA 3960

GTCACTTTTT GCTCCACATC AAAAACAACC CCTTCAAATA CCAGACGATT TTCCTCTGTC 4020

GTCACTTCGA TCATAGGAGT AATCTCCGCC TTATCCAGCT TGGGTTTAGC TGCAGCTTTT 4080

TTCGCTTGAA AATCAAAGAC TGGTTTCTCT TCCGCTGGAG GAGGTGCCAT CTGCTCCAGT 4140

TGTTCCATAG CACGGAGCGC TTCCTCATTG GCAGCTTGAA CAATCTGCTC ATTTTCAGCA 4200

TGAAAGGCCT CTTCCTGCTC TTGGGTCAGG ACATCATTCT TCTCGACTTG ACAGTTAAAA 4260

GTTGGAAAAC CAAACTTTTC AAGTTGTTTG GCTAAATTAG GAAGATGATT CTTCTTAAAA 4320

TGTTCCTTAT CAATCGCTTC AGATCCTTCA ATAAATAGCT GATTACCCTC AGCACGAACT 4380

TGCAAATTTT GATAAAGGGA CTTAAAACCT TGACTAGCAC ATGGACCTTC AGAGAAAGCC 4440

TCCCTATAGT AGGACTGCAA GAGCTGATTT GAAAATTCTT GAGACCGAGC CTTAATTTCA 4500

AAAACAGCTT TATTGCCTGT CTTAGAAAAT TCTTCGCTGA AACCTTTCTT TAATTCTAAA 4560

AAGATTTCAA TCGGTAAAAT ATTAGAAAAT ACGAAATGAA ACTCCCATAC CTTACTAATT 4620

TTATGAACCA CAACTCGCTC AATATTGGCC TGTGCTAAAG CAGGAGCCTG TCTCATTTCA 4680

GCAGGCATCC CCAATTGATT CATCAAAATT TCAAAACTAT TTGACATTCA TTTTCCTCAC 4740

ATTATTCTTC TACTATTTTA CCATATTTAG AGGTATTTTC TAAAGACAAA AGGAAGCCAC 4800

TAAGTGACTT CCTTCTAGAG TGAGGACGGA TTAGTCTTCA CCTTTATTTT TCTTAATAAT 4860

TTCTTCTTGT ACTGACTTAG GTACATCTTC GTAGTGGTCA AATACCATCA TGAATGTACC 4920

ACGTCCTTGA GATGCAGAAC GAAGAACTGT TGCGTAACCG AACATTTCAG CAAGTGGAAC 4980

GTAAGCACGA ACGATTTGGC TGTTACCGTG TGCTTCCATA CCATCTACAC GTCCACGACG 5040

AGCAGTTACG TGACCCATAA CATCACCAAG GTTTTCTTCT GGAACAGTGA TTGTTACAAG 5100

CATCATTGGT TCAAGGATAG CTGGTTGTGC TGATTTAGCA GCTTCTTTAA GGGAAAGTGA 5160

AGCCGCAATC TTGAAGGCAG TTTCAGATGA GTCGACATCG TGATATGAAC CATCATAAAG 5220

CTTAGCTTTA ACGTCAACCA TTGGGTAACC TGCAAGAACA CCGTTAGCCA TAGATTCTAC 5280

CAAACCTTTT TCAACCGCTG GGATAAATTC ACGAGGAACC ACACCACCGA CGATTGCGTT 5340

TTCGAATTCG AATCCTTTAC CTTCTTCGTT TGGAGTAAAT TCAATCCATA CATCACCGAA 5400

TTGACCTTTA CCACCAGACT GACGTTTGAA GAATCCGCGT GCTTGAGTAG AAGCGCGGAA 5460

TGTTTCACGG TAAGATACTT GAGGAGCACC TACGTTCGCT TCAACTTTGA ACTCACGACG 5520

CATACGATCA ACAAGGACGT CAAGGTGAAG TTCACCCATA CCTGAGATAA CTGTTTCACC 5580

AGTTTCAACG TTTGTTTCAA CGCGGAATGT TGGATCTTCT TCAGCCAATT TTTGAAGGGC 5640 

GATACCCATC TTGTCTTGGT CAGCTTTAGA TTTTGGCTCA ACCATCAATT GGATAACTGG 5700 

TTCTGGAACG TTGATTGACT CAAGGATGAT TTTAGCTTTT TCATCTGTCA ATGAGTCACC 5760 

AGTTGTAGTA TCTTTCAAAC CAACGGCAGC AGCGATATCA CCTGAGTAAA CAGTGTCGAT 5820 

TTCTTGACGG CTGTTAGCGT GCATTTGAAG GATACGTCCG ATACGTTCAC GTTTACCTTT 5880 

15 AGAAGTATTC AATACGTATG AACCTGATTG AAGAACACCT GAGTAAACAC GGAAGAATGT 5940 

CAAACGACCT ACGAATGGGT CAGTCATGAT CTTGAAGGCA AGAGCTGCAA ATGGCTCTTC 6000 

GTCAGATGCT GGACGAATTT CTTCAGCGTC TGTATCTGGG TTAATACCTT TGATTGCTGG 6060 

20 

GATGTCAAGT GGACTTGGAA GGTAGTCGAT AACCGCATCA AGCATCAATT GAACACCTTT 6120 

GTTTTTGAAG GCTGAACCAC ACAATACTGG GAAGAATTCA ACGTTGATAG TCGCTTTACG 6180 

25 GATACCAGCT TTCAATTCTT CGTTAGTGAT TTCTTCACCT TCGAGGTATT TCATCATCAA 6240 

TTCTTCGTCA GTTTCAGCAA CTGCTTCAAT CAATTTTTCA CGGTATTCTT GAGCTTGGTC 6300 

AAGGTATTCA GCTGGGATGT CTTCTTCAAG GATATCCGTA CCAAGGTCGT TAGTATAGAT 6360 

30 

TTCAGCTTTC ATCTTGATCA AGTCAATGAT ACCACGGAAG TCATCTTCAG AACCGATTGG 6420 

CAATTGGATT GGGTGTGCAT TTGCTTGAAG ACGATCGTGA AGTGTGCTTA CAGAGTAAAG 6480 

35 GAAGTCAGCA CCGATTTTGT CCATTTTGTT GGCAAATACG ATACGTGGAA CTCCGTACTC 6540 

AGTTGCTTGA CGCCAAACTG TTTCAGTTTG AGGCTCAACA CCTGATTGTG AGTCAAGAAC 6600 

GGTAACCGCA CCATCCAATA CACGAAGAGA ACGTTGTACT TCGATTGTGA AGTCCACGTG 6660 

40 

TCCTGGTGTG TCGATGATGT TTACGCGGTG GTTGTTCCAT TGAGCTGTTG TCGCAGCAGA 6720 

TGTGATCGTG ATACCACGTT CTTGCTCTTG CTCCATCCAG TCCATTTGTG ACGCACCTTC 6780 

45 GTGAGTTTCA CCGATTTTGT GGATTTTACC AGTGTAGTAA AGAATACGCT CAGTAGTTGT 6840 

TGTTTTACCG GCATCGACGT GAGCCATGAT ACCGATATTA CGAGTTTTTT CAAGTGAAAA 6900

TTCGCGTGCC ATGAGGTTTG TTTCTCCTAT TTATTTTTGA TTTCTATTCT ATTATAACAC 6960

50 

GATTTTAATA AAAACGGATA GGCAGGACCT ACCCGTTCTC AATGTTTTCA TGCTATTGTT 7020 

GGTTTCAACT TACGAGATGG TAAGCTGAGT TATAGCTAAT ACTAATCGAT TTAGCTAATT 7080 

55 TGAACCCGGG CTAAAGTTAG TTAGCCGATA TGAGCTGGAA CGGGATGCTG CGCGAAAAAG 7140 

ATAAAACTCC TTGTATTCAT CGAATACTGC GTCAGTTTCC TATTTTCACC TTGCATCCTT 7200 

ACC 7203 

(2) INFORMATION FOR SEQ ID NO: 80:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1581 base pairs 

(B) TYPE: nucleic acid 

5 (C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

10 (iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

15 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80: 

GCATACAAAG GGCATCAAGA ATATCCGTGT TGCCACAAGC TGCGAGAAAG ATTTATGCCT 60 

20 

ACCGCCGTTA TGACCTTAAT GAATCTCCAA AGACCGCTTT AGACCTTATC ATCCCAGATT 120 

TGTTTTTGCA TATTTTGAAC CCTGCTGAAC GTGAAAGAAA ATTAAAGCGC GAAGGTGTAG 180 

25 AAGAATTATA TCTCCTTGAT TTTAGTAGTC AATTCGCTAG TCTCACTGCA CAAGAATTCT 240 

TTGCAACTTA TATCAAGGCT ATGAATGCCA AAATTATTGT TGCAGGTTTT GATTATACAT 300 

TTGGTTCTGA CAAAAAAACA GCAGAAGATT TAAAGGATTA CTTTGATGGA GAAGTTATCA 360 

30 

TTGTTCCACC TGTAGAAGAT GAGAAAGGAA AGATTAGTTC AACTCGTATC CGTCAAGCTA 420 

TTTTAGATGG AAATGTGAAA GAAGCAGGAA AACTTTTGGG GGCACCGCTT CCATCAAGAG 480 

35 GTATGGTAGT TCATGGTAAT GCTCGTGGTC GTACAATTGG TTATCCGACA GCGAATTTAG 540 

TGCTTTTAGA CCGTACTTAT ATGCCAGCAG ATGGCGTTTA TGTCGTTGAT GTTGAGATTC 600 

AAAGACAGAA GTATCGTGCT ATGGCTAGTG TCGGGAAAAA TGTGACCTTT GATGGAGAAG 660 

40 

AAGCACGTTT TGAAGTCAAT ATTTTTGATT TTAATCAAGA TATTTATGGG GAAACCGTCA 720 

TGGTTTATTG GCTTGATCGC ATTCGTGATA TGACCAAATT TGACTCAGTT GACCAATTAG 780 

45 TGGATCAGTT AAAGGCTGAT GAAGAAGTAA CTCGGAATTG GTCTTAAGAG CTTGAGTAAA 840 

TAAAACAAAA AAGAGGTTGT CTGTAACCCA AAAGATAGAT GATTTAGTCT AACTTTTGAG 900 

GTCACGACAT TACCTCTTTT TATTCTTTTT CAAAGGTGAA GCCTTCTCCT AGGATTTCAT 960

50 

GGGCTTCTGT AATAGTTATA AAGGCTTGAG GATCGATTCG ATGAATCATT TCCTTCGTTT 1020 

TCACAATTTC ATTTCTTCCG ACAATACAGT AGATGATTTT CAAATTTTCT TTACTATAGT 1080 

55 AGCCTTGACC AGAAATAAAA GTAACACCTC TTCCGAGGTC ATCATTAATC GCCTTAGCAA 1140 

GTTGGTCAGG ACGTTTTGTG ATAATCATAA AGCCTTTGCC GGCATATCCT CCTTCACCAA 1200 

TCAAATCAAT AACACGAGAA ACAATAAAAT CAAACAAAAG CGTGTAGGAA ACCAATCTCA 1260 

AATCCTTGAA GATTAGGAGA ATCAACATGA GAATACAAAA ATCTAAGATA AAGAGCAGTT 1320

TTCCTATGGA TATATGAGTG TATTTGTTGA GAATACGAGC TAGAATATCA GTTCCGCCAG 1380

TTGTACCTCC AGCATTAAAA ATAATTCCAA GGCCAATTCC CAATAGGATT CCCGCTATAA 1440 

5 

GGGCTGTGAT TAGTAAATCA CCTTGAAGAT CAATATGAAG GGGAATATGC TCAAAAAAAG 1500 

CTAACCAGGC GGACAAAGCT AAGGTTCCTA GTAAACTAGA ATAGAGGGAT TTGGCTCCAA 1560 

10 AGATCTTCCA AGCTAGGATG A 1581 

(2) INFORMATION FOR SEQ ID NO: 81: 

(i) SEQUENCE CHARACTERISTICS: 
15 (A) LENGTH: 879 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

20 (ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

25 



30 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:81: 

CAAGTTTGTC GAATTGCCAA ACACAGTTGA AGGCTTGATT CACATCACTA ATCTACCTGA 60 

ATTTTATCAT TTCAATGAGC GTGATTTGAC TCTTCGTGGA GAAAAATCAG GTATCACTTT 120 

35 CCGAGTGGGT CAGCAGATCC GTATCCGTGT TGAAAGAGCG GATAAAATGA CTGGAGAGAT 180 

TGATTTTTCA TTCGTACCTA GTGAGTTTGA TGTGATTGAA AAAGGCTTGA AACAGTCTAG 240 

TCGTAGTGGC AGAGGGCGTG GTTCAAATCG TCGTTCGGAT AAGAAGGAAG ACAAGAGAAA 300 

40 

ATCAGGACGC TCAAATGATA AGCGTAACAT TTCACAAAAA GACAAGAAGA AAAAAGGAAA 360 

GAAACCTTTT TACAAGGAAG TAGCTAAGAA AGGAGCCAAG CATGGCAAAG GGCGAGGGAA 420 

45 AGGTCGTCGC ACAAAATAAA AAGGCACGCC ACGACTATAC AATCGTAGAT ACGCTAGAGG 480 

CAGGGATGGT CCTGACTGGA ACTGAAATCA AGAGTGTACG AGCTGCTCGA ATTAATCTCA 540 

AGGATGGCTT TGCTCAAGTG AAAAATGGAG AAGTTTGGCT GAGTAATGTT CATATCGCGC 600 


CTTACGAAGA GGGCAATATC TGGAACCAGG AACCAGAACG TCGTCGTAAA CTCCTGCTCC 660 

ATAAAAAGCA AATTCAAAAA TTGGAACAAG AGACCAAAGG GACAGGAATG ACCTTAGTTC 720 

55 CCCTTAAGGT CTATATAGAT GGCTACGCTA AGCTTCTTTT AGGACTTGCC AAGGGAAGCA 780 

TGACTATGAC AAACGGAGTC TATCAAACGT CGTGAGCAAA TCGAGATATC GCGCGTGTGA 840 

TGAAGCTGTT AATCAGCGAT AAAGAGAGGA ATTGAGATG 879 

(2) INFORMATION FOR SEQ ID NO: 82:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1550 base pairs 

(B) TYPE: nucleic acid 

5 (C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

10 (iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

15 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82: 

AAAAGCTTAA TAAATCAATA ATTTCTTCTT TTATCCCCAA CCTGTGGATA AAGTTTGGTA 60 

20 

ACATTGTGGA TTATTTTTCA CAGCTTGTGG AAAATTCTTG CTATCTATGG TAAAATATCT 120 

CTAGTATTAA ACTTTTAAAT AGTAAAGGAG GAGAAAGGAT TGAAAGAAAA ACAATTTTGG 180 

25 AATCGTATAT TAGAATTTGC ACAAGAAAGA CTGACTCGAT CCATGTATGA TTTCTATGCT 240 

ATTCAAGCTG AACTTATCAA GGTAGAGGAA AATGTTGCCA CTATATTTCT ACCTCGCTCT 300 

GAAATGGAAA TGGTCTGGGA AAAACAACTA AAAGATATTA TTGTAGTAGC TGGTTTTGAA 360 

30 

ATTTATGACG CTGAAATAAC TCCCCACTAT ATTTTCACCA AACCTCAAGA TACGACTAGC 420 

TCACAAGTTG AAGAAGCTAC AAATTTAACT CTTTATGACT ATAGTCCAAA GTTAGTATCT 480 

35 ATTCCTTATT CAGATACGGG ATTAAAAGAA AAGTATACCT TTGATAACTT TATTCAAGGG 540 

GATGGAAATG TTTGGGCTGT ATCAGCCGCT TTAGCTGTCT CTGAAGATTT GGCTCTGACC 600 

TATAACCCTC TTTTTATCTA TGGAGGACCA GGCCTTGGTA AGACTCACTT ATTAAACGCT 660 

40 

ATTGGAAATG AAATTCTAAA AAATATTCCT AATGCGCGTG TTAAATATAT CCCTGCCGAA 720 

AGCTTTATTA ATGACTTTCT TGATCACCTA AGACTTGGGG AAATGGAAAA GTTTAAAAAG 780 

45 ACCTATCGTA GTCTTGATCT TTTGTTAATC GATGATATCC AGTCACTCAG CGGAAAAAAA 840 

GTCGCAACTC AGGAAGAATT TTTCAATACC TTTAACGCCC TTCATGACAA GCAAAAACAG 900 

ATTGTCCTAA CGAGTGATCG TAGTCCAAAA CATCTAGAAG GGCTCGAGGA GAGGCTTGTC 960 

50 

ACGCGTTTTA GTTGGGGATT GACACAAACT ATCACACCCC CTGACTTTGA AACACGTATT 1020 

GCCATTTTAC AAAGTAAAAC GGAACATTTA GGCTACAATT TCCAAAGTGA TACTCTAGAA 1080 

55 TACCTAGCTG GGCAATTTGA TTCAAATGTT CGAGATCTTG AGGGAGCCAT CAACGACATC 1140 

ACTTTAATTG CCAGAGTAAA AAAAATCAAG GATATCACTA TTGATATTGC TGCAGAAGCC 1200 

ATTAGAGCCC GCAAACAAGA TGTTAGCCAA ATGCTCGTCA TCCCAATTGA TAAAATCCAA 1260 

60 

ACTGAAGTTG GTAACTTTTA TGGTGTTAGT ATCAAAGAAA TGAAGGGAAG TAGACGCCTT 1320

CAAAATATTG TTTTGGCCCG TCAAGTAGCC ATGTATTTAT CTAGAGAACT AACAGATAAT 1380

AGTCTTCCAA AAATTGGGAA GGAATTGGGG GAAAAGTCAT ACCACAGTCA TTCATGCCCA 1440

TGCCAAAATA AAATCTTGAA TTGATCAAGA CGATAATTTA CGTTTAGAAA TTGAATCATC 1500

AAAAGGAAAA TCAAATAATT TGTGGAAACT TTAGGTTTTT ACCTTTTAGC 1550

(2) INFORMATION FOR SEQ ID NO: 83:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1292 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83:

GGTATGCGCC AAAACTTCTT ATCAAAAAGA ATCTAGCCAA AGAAGCGACT GCTCAAGCTG 60

TAGGTGAACT TCGTGGTAAA CAAAAATCGG AAGAAAAAGC TCACGCTGAG ATGATTGCAG 120

AAGGAAAAGC AATTAAAGCA CAACTTGAAG CAGAAGAAAC TGTTGTAGAA TTTGTTGAAA 180

AAGTTGGTCC AGATGGTCGT ACCTTTGGTT CTATTACCAA TAAGAAGATT GCAGAAGAAT 240

TGCAAAAGCA ATTTGGAATT AAGATTGATA AACGTCATAT TCAAGTACAA GCTCCGATTC 300

GAGCGGTTGG TTTGATTGAT GTGCCAGTGA AAATCTATCA AGATATCACA AGTGTAATCA 360

ATCTTCGTGT GAAAGAAGGA TAAGTTTACA CCTTCTTGAC AAGATTGTAA AAGGAAGGGA 420

AGTCTGATGG CAGAAGTAGA AGAGTTACGA GTACAACCTC AAGATATCTT AGCTGAGCAA 480

TCCGTTTTAG GGGCTATCTT TATTGATGAG AGTAAACTTG TTTTTGTGCG AGAATACATT 540

GAGTCTCGGG ACTTTTTTAA GTATGCCCAT CGTTTGATTT TCCAAGCCAT GGTCGATTTA 600

TCCGATCGTG GTGATGCCAT AGATGCAACA ACGGTTCGTA CTATCCTTGA TAATCAAGGT 660

GATTTACAGA ATATTGGTGG CTTGTCTTAC TTGGTTGAGA TTGTTAATTC TGTGCCAACT 720

TCTGCTAATG CGGAGTATTA TGCTAAGATT GTTGCAGAAA AAGCAATGCT ACGTCGTTTA 780

ATTGCCAAGT TGACAGAGTC TGTCAACCAA GCTTACGAAG CGTCACAACC AGCTGATGAA 840

ATTATTGCTC AGGCAGAAAA AGGGTTGATT GATGTCAGTG AAAATGCAAA TCGAAGCGGG 900

TTTAAGAACA TTCGAGATGT GTTGAATCTC AACTTTGGAA ATCTGGAAGC TCGCTCGCAA 960

CAAACGACCG ATATTACAGG TATTGCGACA GGTTATCGTG ATTTGGATCA TATGACAACA 1020

GGACTTCATG AGGAGGAGTT GATTATCTTA GCAGCTCGTC CAGCAGTTGG TAAGACAGCA 1080

TTTGCCTTGA ATATCGCTCA GAATATTGGG ACTAAGTTGG ACAAAACGGT TGCTATTTTT 1140 

5 

TCACTCGAAA TGGGTGCGGA AAGCTTGGTA GACCGTATGT TAGCTGCAGA AGGCTTGGTG 1200 

GAGTCACATT CTATCCGTAC AGGGCAATTG ACAGATGAGG AGTGGCAAAA ATATACTATT 1260 

10 GCTCAGGGTA ATCGTACTAA CGCCAGTATC TA 1292 

(2) INFORMATION FOR SEQ ID NO: 84: 

(i) SEQUENCE CHARACTERISTICS: 
15 (A) LENGTH: 1876 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

20 (ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

25 



30 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84: 

AGGCTGATCC TGCGGTTATT GACCCATGGA AGAGCACCTA GAAGGAGATC ATTCCCAGAC 60 

GATATTTTAG TTTTATCCTA GTAGCCTTCC CTGGCTATTT TAGGAGCTCG TCTCTACTAT 120 

GTATTTCCGA TTTGATTACT ATAGTCAGAA TTTAGGAGAG ATTTTTGCCA TTGGAATGGT 180

GGTTGGCCAT TTACGGTGGT TGATAACTGG GGCTCTTGTG CTCTATATCT TTGCTGACCG 240 

TAAACTCATC AATACTTGGG ATTTTCTAGA TATTGCGGCG CCTAGCGTTA TGATTGCTCA 300 

40 

AAGTTTGGGG CGTTGGGGTA ATTTCTTTAA CCAAGAAGCT TATGGTGCAA CAGTGGATAA 360 

TCTGGATTAT CTACCTGGCT TTATCCGTGA CCAGATGTAT ATTGAGGGGA GCTACCGTCA 420 

45 ACCGACTTTC CTTTATGAGT CTCTATGGAA TCTGCTTGGC TTTGCCTTGA TTCTGATTTT 480 

TAGACGGAAA TGGAAGAGTC TCAGACGAGG TCATATCACG GCCTTTTACT TGATTTGGTA 540 

TGGTTTCGGT CGTATGGTCA TCGAAGGTAT GCGAACAGAT AGTCTCATGT TCTTCGGACT 600 

50 

TCGAGTGTCC CAATGGCTGT CAGTTGTCCT TATCGGTCTC GGTATAATGA TCGTTATTTA 660 

TCAAAATCGA AAGAAGGCCC CTTACTATAT TACAGAGGAG GAAAACTAAA TGTTAGAAGT 720 

55 TGCATATATT CTTGTTGCCC TAGCTTTGAT TGTCTTTTTG GTCTATCTGA TCATTACTGT 780 

ACAAAAGCTT GGTCGTGTCA TCGATGAAAC AGAAAAGACG ATTAAAACCT TGACTTCAGA 840 

TGTGGATGTG ACCTTGCATC ACACCAATGA GTTGTTGGCT AAGGTCAATG TCTTGGCAGA 900 

TGATATCAAT GTCAAGGTGG CTACGATTGA TCCACTCTTC AGTGCTGTTG CAGATTTATC 960

TCTATCTGTT TCAGACCTCA ATGACCATGC GCGTGTCTTG AGCAAGAAAG CTTCATCAGC 1020
TGGTTCAAAA ACACTCAAGA CTGGTGCAAG TCTGTCAGCT CTTCGTCTTG CAAGTAAATT 1080 

5 

TTTCAAAAAA TAAAAAAGGA GAATCCTTAT GGGTAAATTA TCCTCAATCC TTTTAGGAAC 1140 
GGTTTCAGGT GCAGCTCTTG CCTTGTTTTT AACAAGTGAT AAGGGCAAAC AAGTTTGCAG 1200 
10 TCAGGCTCAA GATTTTCTAG ATGATTTGAG AGAAGATCCG GAGTATGCCA AGGAGCAAGT 1260 
CTGTGAAAAA CTGACAGAAG TTAAGGAGCA GGCTACAGAT TTTGTTCTGA AAACAAAAGA 1320 
ACAGGTTGAG TCAGGTGAAA TCACTGTGGA CAGTATACTT GCTCAAGCTA AATCCTATGC 1380 

15 

TTTTCAAGCG ACAGAAGCAT CAAAAAATCA ATTAAATAAT CTCAAGGAAC AATGGCAAGA 1440 
AAAAGCCGAA GCTCTTGATG ACTCAGAAGA GATTGTGATT CATATAACAG AAGAATAAAC 1500 
20 CATCACCATC TCCGGACGGA CTATGTATCT GGGGATGGTG ATTTTTATCT GGAATCTAGT 1560 
CTTTGTGGTA TAATAATTAC TATGCAGAAA AAACCAACGT CAGCCTATGT GCACATCCCA 1620
TTTTGTACCC AGATTTGTTA TTATTGTGAT TTTTCAAAGG TCTTCATCAA AAATCAGCCA 1680 

25 

GTCGACAGCT ATTTAGAGCA TCTGCTGGAA GAGTTTCGTT CTTATGATAT TGAAAAGTTG 1740 
TCAACCCTTT ATATCGGTGG TGGAACACGA CAGCCCTGTC GGCTCCGCAA CTGGAGGTGT 1800 
30 TACTGAATGG CTTGACTAAA AACTTGGATT TGTCTGCTTG GAGAGTGACC ATTGAAGCCA 1860 
TCCAGGCGAT TTGGAA 1876 
(2) INFORMATION FOR SEQ ID NO: 85: 

35 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1574 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 



45 



50 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85: 
TTGGAAGATT TCCCACTTTC AGTGACCAAC CCATACGGTC GTACTAAGCT CATGCTAGAG 60 
55 GAAATTTTGA CTGATATTTA CAAAGCAGAC TCAGAATGGA ATGTTGTCTT GCTTCGTTAC 120 



TTTAACCCAA TCGGAGTCCA TGAGAGTGGT GATTTGGGAG AAAATCCAAA CGGTATTCCA 180 



AACAATCTCT TGCCATATGT GACTCAAGTA GCCGTTGGAA AATTAGAGCA AGTGCAAGTG 240 

60 

TTTGGAGACG ATTACGATAC GGAAGATGGA ACAGGTGTTC GTGACTATAT CCACGTTGTC 300

GATTTGGCTA AGGGTCACGT TGCAGCTTTG AAAAAAATCC AAAAAGGTTC AGGACTAAAC 360

GTTTATAACC TTGGAACTGG TAAAGGTTAC TCAGTTCTTG AAATTATCCA AAACATGGAA 420

AAAGCGGTGG GATGTCCTAT TCCTTACCGC ATCGTAGAAC GTCGCCCAGG TGATATCGCT 480

GCCTGCTACT CAGACCCAGC AAAGGCTAAA GCAGAACTCG GTTGGGAAGC AGAACTCGAC 540

ATCACCCAAA TGTGTGAAGG CCATGGCGTT GGCAGAGCAA GCATCCAAAT GGATTTGAAG 600

ACTAAGATGA TGATTTCAAT CATCGTCCCT TGTTTAACGA AGAGGAAGTA CTTCCTCTTT 660

TTTATCAGGC TCTGGAAGCT TTACTTCCAG ATTTGGAAAC AAAATCGAGT ATGTCTTTGT 720

CGATGATGGA TCAAGTGATG GGACCTTGGA ACTCTTAAAG GCCTATCGGG AGCAAAATCC 780

GGCAGTCCAT TATATTTCTT TCTCTCGAAA TTTTGGCAAA GAAGCAGCCC TTTATGCAGG 840

CTTGCAATAT GCGACAGGAG ATTTGGTGGT GGTGATGGAT GCAGACCTCC AAGATCCTCC 900

TAGTATGTTG TTTGAGATGA AAAATGTACT AGACAAAAAT GTAGACTTGG ACTGCGTTGG 960

GACACGGAGA ACTAGTCGGG AGGGAGAACC CTTCTTTCGC AGTTTCTGTG CTGTTCTCTT 1020

TTATCGCCTC ATGCAAAAAA TCAGCCCAGT AGCTCTGCCG TCGGGTGTCC GTGATTTTCG 1080

TATGATGAGA AGGTCTGTGG TCGATGCCAT TTTAAGCTTG ACTGAGTCCA ATCGTTTTTC 1140

TAAGGGACTC TTTGCCTGGG TCGGCTTTAA AACCCACTAT CTGGACTATC CAAATGTCGA 1200

AAGGCAGGCT GGCAAGACCA GTTGGAGTTT TAGGCAACTC TTTTTTTACT CCATTGAAGG 1260

GATTGTTAAT TTTTCAGATT TCCCTTTGAC TATAGCCTTT GTAGCTGGTC TCCTATCTTG 1320

TTTTCTTTCT CTGCTGATGA CCTTTTTTGT TGTGGTTCGG ACCCTCATTT TGGGCAATCC 1380

GACATCTGGT TGGACCTCTC TGATGGCTGT TATTCTCTAT CTTGGAGGCA TTCAACTCTT 1440

GACCATTGGG ATTCTCGGTA AGTATAATCA GTAAGATTAT TTAGAGACTA AAAAAAAGAC 1500

CACTTTATCT TATCAAAGAA AAAGTGACCT TCCTGATTTT ACAGAAACCT AAAGTGAAAA 1560

GACTATAATT TTCC 1574



(2) INFORMATION FOR SEQ ID NO: 86: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 969 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86:

GTTATAATTA TTGATGATAA TTATAGTAAT GTAAATTTAA GAAATAAAAT TATCCATCAA 60 

TTTGGCTATA CCAATCATAG AATTAAGTTA ATTTTAAGTA ATGAAGATTT AGGTGCAACT 120 

AATGCCAGAA ACATAGGTAT CAAAAATTCT AGAGGTAAGT ATATATCATT TTTAGACGAT 180 

GATGATGAAT ATATGCCAGA TCGAATTTTA AAGTTGATGG CTTGTTTTAA AAAGAGTAGA 240 

ATGAAGAATT TAGCTTTAGT TTATAGTTAT GGCATAATAA TTTATCCAAA TGGTACACGA 300 

GAAGAGGAGA AGACCGATTT TGTTGGAAAT CCCTTGTTTG TTCAAATGGT TCACAATATA 360 

GCAGGTACGT CATTTTGGTT GTGTAAAAAA GAGGTGCTAG AATTAATTAA TGGTTTTGAG 420 

AAAATAGATT CACATCAGGA CGGTGTTGTT TTATTAAAAC TACTTGCTCA AGGATACCAA 480 

ATTGATATAG TGCGAGAATT CTTGGTGAAT TACTACGCTC ACAGTAAAGA AAACGGTATC 540 

ACTGGAGTGA CACAAAAAAC AATTAATGCA GATGAAGAAT ATTATAATTA CTGTAGGAAA 600 

TATTTTAATT TATTGAGTTT CAACGAGAGA ATATTGGTTA CAAAGAAATA TTATTCTTTA 660 

AACATAAAGC GGTTACTATT AATAGGAGAC AAATGCAAGG CTTTAAAAGT AATCAAGAAG 720 

GCAAGAGAAG AAAAAATTTT TAACGAATTT CTTTTTTTGA AATATATGTT ATTATATAAC 780 

GTAGTTTTTT CTATTGTATA TATGACAACT ATGTTCAATT AAAATTTAGA AAGTGAGAAA 840 

CTATTGTGTA TACTATTATA AATTCAATAT AAACATTTAG GTTAATTAAC GATAATTAAT 900 

CGGTGCTGGG TCATTAATTG CTAATTTAAT GCAGCACTAT TAATGCTCAG GTGTTGAATG 960 

AATTAATGC 969 
(2) INFORMATION FOR SEQ ID NO: 87: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1353 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..1350 

(D) OTHER INFORMATION: DNA B 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87: 

ATG GCA GAA GTA GAA GAG TTA CGA GTA CAA CCT CAA GAT ATC TTA GCT 48 
Met Ala Glu Val Glu Glu Leu Arg Val Gln Pro Gln Asp Ile Leu Ala
1 5 10 15

GAG CAA TCC GTT TTA GGG GCT ATC TTT ATT GAT GAG AGT AAA CTT GTT 96
Glu Gln Ser Val Leu Gly Ala Ile Phe Ile Asp Glu Ser Lys Leu Val
20 25 30

TTT GTG CGA GAA TAC ATT GAG TCT CGG GAC TTT TTT AAG TAT GCC CAT 144
Phe Val Arg Glu Tyr Ile Glu Ser Arg Asp Phe Phe Lys Tyr Ala His
35 40 45

CGT TTG ATT TTC CAA GCC ATG GTC GAT TTA TCC GAT CGT GGT GAT GCC 192
Arg Leu Ile Phe Gln Ala Met Val Asp Leu Ser Asp Arg Gly Asp Ala
50 55 60

ATA GAT GCA ACA ACG GTT CGT ACT ATC CTT GAT AAT CAA GGT GAT TTA 240
Ile Asp Ala Thr Thr Val Arg Thr Ile Leu Asp Asn Gln Gly Asp Leu
65 70 75 80

CAG AAT ATT GGT GGC TTG TCT TAC TTG GTT GAG ATT GTT AAT TCT GTG 288
Gln Asn Ile Gly Gly Leu Ser Tyr Leu Val Glu Ile Val Asn Ser Val
85 90 95

CCA ACT TCT GCT AAT GCG GAG TAT TAT GCT AAG ATT GTT GCA GAA AAA 336
Pro Thr Ser Ala Asn Ala Glu Tyr Tyr Ala Lys Ile Val Ala Glu Lys
100 105 110

GCA ATG CTA CGT CGT TTA ATT GCC AAG TTG ACA GAG TCT GTC AAC CAA 384
Ala Met Leu Arg Arg Leu Ile Ala Lys Leu Thr Glu Ser Val Asn Gln
115 120 125

GCT TAC GAA GCG TCA CAA CCA GCT GAT GAA ATT ATT GCT CAG GCA GAA 432
Ala Tyr Glu Ala Ser Gln Pro Ala Asp Glu Ile Ile Ala Gln Ala Glu
130 135 140

AAA GGG TTG ATT GAT GTC AGT GAA AAT GCA AAT CGA AGC GGG TTT AAG 480
Lys Gly Leu Ile Asp Val Ser Glu Asn Ala Asn Arg Ser Gly Phe Lys
145 150 155 160

AAC ATT CGA GAT GTG TTG AAT CTC AAC TTT GGA AAT CTG GAA GCT CGC 528
Asn Ile Arg Asp Val Leu Asn Leu Asn Phe Gly Asn Leu Glu Ala Arg
165 170 175

TCG CAA CAA ACG ACC GAT ATT ACA GGT ATT GCG ACA GGT TAT CGT GAT 576
Ser Gln Gln Thr Thr Asp Ile Thr Gly Ile Ala Thr Gly Tyr Arg Asp
180 185 190

TTG GAT CAT ATG ACA ACA GGA CTT CAT GAG GAG GAG TTG ATT ATC TTA 624
Leu Asp His Met Thr Thr Gly Leu His Glu Glu Glu Leu Ile Ile Leu
195 200 205

GCA GCT CGT CCA GCA GTT GGT AAG ACA GCA TTT GCC TTG AAT ATC GCT 672
Ala Ala Arg Pro Ala Val Gly Lys Thr Ala Phe Ala Leu Asn Ile Ala
210 215 220



CAG AAT ATT GGG ACT AAG TTG GAC AAA ACG GTT GCT ATT TTT TCA CTC 720
Gln Asn Ile Gly Thr Lys Leu Asp Lys Thr Val Ala Ile Phe Ser Leu
225 230 235 240

GAA ATG GGT GCG GAA AGC TTG GTA GAC CGT ATG TTA GCT GCA GAA GGC 768
Glu Met Gly Ala Glu Ser Leu Val Asp Arg Met Leu Ala Ala Glu Gly
245 250 255



TTG GTG GAG TCA CAT TCT ATC CGT ACA GGG CAA TTG ACA GAT GAG GAG 816
Leu Val Glu Ser His Ser Ile Arg Thr Gly Gln Leu Thr Asp Glu Glu
260 265 270

TGG CAA AAA TAT ACT ATT GCT CAG GGT AAT CTA GCT AAC GCC AGT ATC 864
Trp Gln Lys Tyr Thr Ile Ala Gln Gly Asn Leu Ala Asn Ala Ser Ile
275 280 285

TAT ATC GAT GAT ACG CCA GGT ATT CGG ATT ACA GAG ATT CGT TCT CGT 912
Tyr Ile Asp Asp Thr Pro Gly Ile Arg Ile Thr Glu Ile Arg Ser Arg
290 295 300

TCT CGT AAA TTG GCT CAA GAA ACT GGA AAT CTT GGT TTG ATT TTG ATA 960
Ser Arg Lys Leu Ala Gln Glu Thr Gly Asn Leu Gly Leu Ile Leu Ile
305 310 315 320

GAC TAT TTG CAA CTT ATC ACG GGA ACT GGT CGA GAA AAT CGT CAA CAA 1008
Asp Tyr Leu Gln Leu Ile Thr Gly Thr Gly Arg Glu Asn Arg Gln Gln
325 330 335

GAA GTT TCT GAA ATT TCT CGT CAG TTG AAA ATA CTA GCC AAG GAA TTG 1056
Glu Val Ser Glu Ile Ser Arg Gln Leu Lys Ile Leu Ala Lys Glu Leu
340 345 350

AAG GTT CCA GTA ATC GCT CTG AGT CAG CTT TCT CGT GGT GTA GAA CAA 1104
Lys Val Pro Val Ile Ala Leu Ser Gln Leu Ser Arg Gly Val Glu Gln
355 360 365

CGT CAG GAC AAG AGA CCG GTC TTG TCT GAT ATT CGT GAA TCT GGG TCT 1152
Arg Gln Asp Lys Arg Pro Val Leu Ser Asp Ile Arg Glu Ser Gly Ser
370 375 380

ATT GAG CAG GAC GCT GAT ATC GTA GCT TTT CTC TAT CGC GAT GAC TAC 1200
Ile Glu Gln Asp Ala Asp Ile Val Ala Phe Leu Tyr Arg Asp Asp Tyr
385 390 395 400

TAT GAA CGT GGT GGT GAA GAA GAG GAG GGT ATC CCA AAT AAT AAG GTG 1248
Tyr Glu Arg Gly Gly Glu Glu Glu Glu Gly Ile Pro Asn Asn Lys Val
405 410 415

GAA GTT ATT ATC GAG AAA AAC CGT AGT GGA GCT CGT GGA ACA GTG GAA 1296
Glu Val Ile Ile Glu Lys Asn Arg Ser Gly Ala Arg Gly Thr Val Glu
420 425 430

TTG ATT TTC CAA AAA GAA TAC AAT AAA TTT TCA AGT ATC TCA AAG AGG 1344
Leu Ile Phe Gln Lys Glu Tyr Asn Lys Phe Ser Ser Ile Ser Lys Arg
435 440 445

GAG GCA TAA 1353
Glu Ala
450

(2) INFORMATION FOR SEQ ID NO: 88:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 450 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88:

Met Ala Glu Val Glu Glu Leu Arg Val Gln Pro Gln Asp Ile Leu Ala
1 5 10 15

Glu Gln Ser Val Leu Gly Ala Ile Phe Ile Asp Glu Ser Lys Leu Val
20 25 30

Phe Val Arg Glu Tyr Ile Glu Ser Arg Asp Phe Phe Lys Tyr Ala His
35 40 45

Arg Leu Ile Phe Gln Ala Met Val Asp Leu Ser Asp Arg Gly Asp Ala
50 55 60

Ile Asp Ala Thr Thr Val Arg Thr Ile Leu Asp Asn Gln Gly Asp Leu
65 70 75 80

Gln Asn Ile Gly Gly Leu Ser Tyr Leu Val Glu Ile Val Asn Ser Val
85 90 95

Pro Thr Ser Ala Asn Ala Glu Tyr Tyr Ala Lys Ile Val Ala Glu Lys
100 105 110

Ala Met Leu Arg Arg Leu Ile Ala Lys Leu Thr Glu Ser Val Asn Gln
115 120 125

Ala Tyr Glu Ala Ser Gln Pro Ala Asp Glu Ile Ile Ala Gln Ala Glu
130 135 140

Lys Gly Leu Ile Asp Val Ser Glu Asn Ala Asn Arg Ser Gly Phe Lys
145 150 155 160

Asn Ile Arg Asp Val Leu Asn Leu Asn Phe Gly Asn Leu Glu Ala Arg
165 170 175

Ser Gln Gln Thr Thr Asp Ile Thr Gly Ile Ala Thr Gly Tyr Arg Asp
180 185 190

Leu Asp His Met Thr Thr Gly Leu His Glu Glu Glu Leu Ile Ile Leu
195 200 205

Ala Ala Arg Pro Ala Val Gly Lys Thr Ala Phe Ala Leu Asn Ile Ala
210 215 220

Gln Asn Ile Gly Thr Lys Leu Asp Lys Thr Val Ala Ile Phe Ser Leu
225 230 235 240

Glu Met Gly Ala Glu Ser Leu Val Asp Arg Met Leu Ala Ala Glu Gly
245 250 255

Leu Val Glu Ser His Ser Ile Arg Thr Gly Gln Leu Thr Asp Glu Glu
260 265 270

Trp Gln Lys Tyr Thr Ile Ala Gln Gly Asn Leu Ala Asn Ala Ser Ile
275 280 285

Tyr Ile Asp Asp Thr Pro Gly Ile Arg Ile Thr Glu Ile Arg Ser Arg
290 295 300

Ser Arg Lys Leu Ala Gln Glu Thr Gly Asn Leu Gly Leu Ile Leu Ile
305 310 315 320

Asp Tyr Leu Gln Leu Ile Thr Gly Thr Gly Arg Glu Asn Arg Gln Gln
325 330 335

Glu Val Ser Glu Ile Ser Arg Gln Leu Lys Ile Leu Ala Lys Glu Leu
340 345 350

Lys Val Pro Val Ile Ala Leu Ser Gln Leu Ser Arg Gly Val Glu Gln
355 360 365

Arg Gln Asp Lys Arg Pro Val Leu Ser Asp Ile Arg Glu Ser Gly Ser
370 375 380

Ile Glu Gln Asp Ala Asp Ile Val Ala Phe Leu Tyr Arg Asp Asp Tyr
385 390 395 400

Tyr Glu Arg Gly Gly Glu Glu Glu Glu Gly Ile Pro Asn Asn Lys Val
405 410 415

Glu Val Ile Ile Glu Lys Asn Arg Ser Gly Ala Arg Gly Thr Val Glu
420 425 430

Leu Ile Phe Gln Lys Glu Tyr Asn Lys Phe Ser Ser Ile Ser Lys Arg
435 440 445

Glu Ala
450

(2) INFORMATION FOR SEQ ID NO: 89: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1785 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..1782 

(D) OTHER INFORMATION: DNA G 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89: 

ATG ATA ACC ATG GAG GTA TTG TGT ATG GTT GAC AAA CAA GTC ATT GAA 48
Met Ile Thr Met Glu Val Leu Cys Met Val Asp Lys Gln Val Ile Glu
1 5 10 15

GAA ATC AAA AAC AAT GCC AAC ATT GTG GAA GTC ATA GGA GAT GTG ATT 96
Glu Ile Lys Asn Asn Ala Asn Ile Val Glu Val Ile Gly Asp Val Ile
20 25 30

TCT TTA CAA AAG GCA GGA CGG AAC TAT CTA GGG CTC TGT CCT TTT CAT 144
Ser Leu Gln Lys Ala Gly Arg Asn Tyr Leu Gly Leu Cys Pro Phe His
35 40 45

GGT GAA AAA ACA CCT TCT TTC AGC GTT GTA GAG GAC AAG CAG TTT TAC 192
Gly Glu Lys Thr Pro Ser Phe Ser Val Val Glu Asp Lys Gln Phe Tyr
50 55 60

CAC TGT TTT GGT TGT GGT CGC TCA GGT GAT GTC TTT AAA TTC ATC GAG 240
His Cys Phe Gly Cys Gly Arg Ser Gly Asp Val Phe Lys Phe Ile Glu
65 70 75 80

GAG TAC CAA GGG GTT ACC TTT ATG GAG GCT GTC CAA ATC TTA GGT CAG 288
Glu Tyr Gln Gly Val Thr Phe Met Glu Ala Val Gln Ile Leu Gly Gln
85 90 95

CGT GTC GGG ATT GAG GTT GAA AAA CCG CTT TAT AGT GAA CAG AAG CCA 336
Arg Val Gly Ile Glu Val Glu Lys Pro Leu Tyr Ser Glu Gln Lys Pro
100 105 110

GCC TCG CCT CAC CAA GCT CTT TAT GAT ATG CAC GAA GAT GCG GCT AAA 384
Ala Ser Pro His Gln Ala Leu Tyr Asp Met His Glu Asp Ala Ala Lys
115 120 125



TTT TAC CAT GCT ATT CTC ATG ACA ACG ACT ATG GGC GAA GAG GCC AGA 432
Phe Tyr His Ala Ile Leu Met Thr Thr Thr Met Gly Glu Glu Ala Arg
130 135 140

AAT TAC CTT TAT CAG CGG GGT TTG ACA GAT GAA GTG CTT AAA CAT TTT 480
Asn Tyr Leu Tyr Gln Arg Gly Leu Thr Asp Glu Val Leu Lys His Phe
145 150 155 160

TGG ATT GGT TTA GCA CCT CCA GAA CGA AAC TAT CTC TAT CAA CGT TTG 528
Trp Ile Gly Leu Ala Pro Pro Glu Arg Asn Tyr Leu Tyr Gln Arg Leu
165 170 175

TCT GAT CAG TAT CGT GAA GAG GAT TTA CTG GAT TCA GGC CTG TTT TAT 576
Ser Asp Gln Tyr Arg Glu Glu Asp Leu Leu Asp Ser Gly Leu Phe Tyr
180 185 190

CTT TCG GAT GCC AAT CAA TTT GTA GAC ACC TTT CAC AAT CGC ATT ATG 624
Leu Ser Asp Ala Asn Gln Phe Val Asp Thr Phe His Asn Arg Ile Met
195 200 205

TTT CCC CTG ACA AAT GAC CAA GGA AAG GTC ATT GCC TTC TCA GGT CGT 672
Phe Pro Leu Thr Asn Asp Gln Gly Lys Val Ile Ala Phe Ser Gly Arg
210 215 220

ATC TGG CAA AAA ACG GAT TCA CAA ACT TCT AAG TAT AAA AAC AGC CGA 720
Ile Trp Gln Lys Thr Asp Ser Gln Thr Ser Lys Tyr Lys Asn Ser Arg
225 230 235 240

TCG ACT GTA ATT TTT AAC AAA AGT TAC GAA TTA TAT CAT ATG GAT AGG 768
Ser Thr Val Ile Phe Asn Lys Ser Tyr Glu Leu Tyr His Met Asp Arg
245 250 255

GCA AAA AGA TCT TCT GGA AAA GCT AGT GAG ATT TAC CTG ATG GAA GGA 816
Ala Lys Arg Ser Ser Gly Lys Ala Ser Glu Ile Tyr Leu Met Glu Gly
260 265 270

TTC ATG GAT GTT ATT GCA GCC TAT CGG GCT GGA ATC GAA AAT GCT GTG 864
Phe Met Asp Val Ile Ala Ala Tyr Arg Ala Gly Ile Glu Asn Ala Val
275 280 285

GCG TCG ATG GGA ACG GCC TTG AGT CGA GAG CAT GTT GAG CAT CTG AAA 912
Ala Ser Met Gly Thr Ala Leu Ser Arg Glu His Val Glu His Leu Lys
290 295 300

AGG TTA ACC AAG AAA TTG GTT CTT GTT TAC GAT GGA GAT AAG GCT GGG 960
Arg Leu Thr Lys Lys Leu Val Leu Val Tyr Asp Gly Asp Lys Ala Gly
305 310 315 320

CAA GCC GCG ACA TTG AAA GCA TTG GAT GAA ATT GGT GAT ATG CCT GTG 1008
Gln Ala Ala Thr Leu Lys Ala Leu Asp Glu Ile Gly Asp Met Pro Val
325 330 335

CAA ATC GTC AGC ATG CCT GAT AAC TTG GAT CCT GAT GAA TAT CTA CAA 1056
Gln Ile Val Ser Met Pro Asp Asn Leu Asp Pro Asp Glu Tyr Leu Gln
340 345 350



AAA AAT GGT CCA GAA GAC TTG GCC TAT CTA TTA ACG AAA ACT CGT ATT 1104
Lys Asn Gly Pro Glu Asp Leu Ala Tyr Leu Leu Thr Lys Thr Arg Ile
355 360 365

AGT CCG ATT GAG TTC TAC ATT CAT CAG TAC AAA CCT GAA AAC GGT GAA 1152
Ser Pro Ile Glu Phe Tyr Ile His Gln Tyr Lys Pro Glu Asn Gly Glu
370 375 380

AAT CTG CAG GCT CAG ATT GAG TTT CTT GAA AAA ATA GCT CCC TTG ATT 1200
Asn Leu Gln Ala Gln Ile Glu Phe Leu Glu Lys Ile Ala Pro Leu Ile
385 390 395 400

GTT CAA GAA AAG TCC ATC GCT GCT CAA AAC AGC TAT ATT CAT ATT TTA 1248
Val Gln Glu Lys Ser Ile Ala Ala Gln Asn Ser Tyr Ile His Ile Leu
405 410 415

GCT GAC AGT CTG GCG TCC TTT GAT TAT ACC CAG ATT GAG CAG ATT GTT 1296
Ala Asp Ser Leu Ala Ser Phe Asp Tyr Thr Gln Ile Glu Gln Ile Val
420 425 430

AAT GAG AGT CGT CAG GTG CAA AGG CAG AAT CGC ATG GAA AGA ATT TCC 1344
Asn Glu Ser Arg Gln Val Gln Arg Gln Asn Arg Met Glu Arg Ile Ser
435 440 445

AGA CCG ACG CCA ATC ACC ATG CCT GTC ACC AAG CAG TTA TCG GCT ATT 1392
Arg Pro Thr Pro Ile Thr Met Pro Val Thr Lys Gln Leu Ser Ala Ile
450 455 460

ATG AGG GCA GAA GCC CAT CTA CTC TAT CGG ATG ATG GAA TCC CCT CTT 1440
Met Arg Ala Glu Ala His Leu Leu Tyr Arg Met Met Glu Ser Pro Leu
465 470 475 480

GTT TTG AAC GAT TAC CGT TTG CGA GAA GAC TTT GCA TTT GCT ACA CCT 1488
Val Leu Asn Asp Tyr Arg Leu Arg Glu Asp Phe Ala Phe Ala Thr Pro
485 490 495

GAA TTT CAG GTC TTA CAT GAC TTG CTT GGC CAG TAT GGA AAT CTT CCT 1536
Glu Phe Gln Val Leu His Asp Leu Leu Gly Gln Tyr Gly Asn Leu Pro
500 505 510

CCA GAA GTT TTA GCA GAG CAG ACA GAG GAA GTT GAA AGA GCT TGG TAC 1584
Pro Glu Val Leu Ala Glu Gln Thr Glu Glu Val Glu Arg Ala Trp Tyr
515 520 525

CAA GTT TTA GCT CAG GAT TTG CCT GCT GAG ATA TCG CCG CAG GAA CTT 1632
Gln Val Leu Ala Gln Asp Leu Pro Ala Glu Ile Ser Pro Gln Glu Leu
530 535 540

AGT GAA GTA GAG ATG ACT CGA AAC AAG GCT CTC TTG AAT CAG GAC AAT 1680
Ser Glu Val Glu Met Thr Arg Asn Lys Ala Leu Leu Asn Gln Asp Asn
545 550 555 560

ATG AGA ATC AAA AAG AAG GTG CAG GAA GCT AGC CAT GTA GGA GAT ACA 1728
Met Arg Ile Lys Lys Lys Val Gln Glu Ala Ser His Val Gly Asp Thr
565 570 575

GAT ACA GCC CTA GAA GAA TTG GAA CGT TTA ATT TCC CAA AAG AGA AGA 1776
Asp Thr Ala Leu Glu Glu Leu Glu Arg Leu Ile Ser Gln Lys Arg Arg
580 585 590

ATG GAG TAA 1785
Met Glu



(2) INFORMATION FOR SEQ ID NO: 90:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 594 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90:

Met Ile Thr Met Glu Val Leu Cys Met Val Asp Lys Gln Val Ile Glu
1 5 10 15

Glu Ile Lys Asn Asn Ala Asn Ile Val Glu Val Ile Gly Asp Val Ile
20 25 30

Ser Leu Gln Lys Ala Gly Arg Asn Tyr Leu Gly Leu Cys Pro Phe His
35 40 45

Gly Glu Lys Thr Pro Ser Phe Ser Val Val Glu Asp Lys Gln Phe Tyr
50 55 60

His Cys Phe Gly Cys Gly Arg Ser Gly Asp Val Phe Lys Phe Ile Glu
65 70 75 80

Glu Tyr Gln Gly Val Thr Phe Met Glu Ala Val Gln Ile Leu Gly Gln
85 90 95

Arg Val Gly Ile Glu Val Glu Lys Pro Leu Tyr Ser Glu Gln Lys Pro
100 105 110

Ala Ser Pro His Gln Ala Leu Tyr Asp Met His Glu Asp Ala Ala Lys
115 120 125

Phe Tyr His Ala Ile Leu Met Thr Thr Thr Met Gly Glu Glu Ala Arg
130 135 140

Asn Tyr Leu Tyr Gln Arg Gly Leu Thr Asp Glu Val Leu Lys His Phe
145 150 155 160

Trp Ile Gly Leu Ala Pro Pro Glu Arg Asn Tyr Leu Tyr Gln Arg Leu
165 170 175

Ser Asp Gln Tyr Arg Glu Glu Asp Leu Leu Asp Ser Gly Leu Phe Tyr
180 185 190

Leu Ser Asp Ala Asn Gln Phe Val Asp Thr Phe His Asn Arg Ile Met
195 200 205

Phe Pro Leu Thr Asn Asp Gln Gly Lys Val Ile Ala Phe Ser Gly Arg
210 215 220

Ile Trp Gln Lys Thr Asp Ser Gln Thr Ser Lys Tyr Lys Asn Ser Arg
225 230 235 240

Ser Thr Val Ile Phe Asn Lys Ser Tyr Glu Leu Tyr His Met Asp Arg
245 250 255

Ala Lys Arg Ser Ser Gly Lys Ala Ser Glu Ile Tyr Leu Met Glu Gly
260 265 270

Phe Met Asp Val Ile Ala Ala Tyr Arg Ala Gly Ile Glu Asn Ala Val
275 280 285

Ala Ser Met Gly Thr Ala Leu Ser Arg Glu His Val Glu His Leu Lys
290 295 300

Arg Leu Thr Lys Lys Leu Val Leu Val Tyr Asp Gly Asp Lys Ala Gly
305 310 315 320

Gln Ala Ala Thr Leu Lys Ala Leu Asp Glu Ile Gly Asp Met Pro Val
325 330 335

Gln Ile Val Ser Met Pro Asp Asn Leu Asp Pro Asp Glu Tyr Leu Gln
340 345 350

Lys Asn Gly Pro Glu Asp Leu Ala Tyr Leu Leu Thr Lys Thr Arg Ile
355 360 365

Ser Pro Ile Glu Phe Tyr Ile His Gln Tyr Lys Pro Glu Asn Gly Glu
370 375 380

Asn Leu Gln Ala Gln Ile Glu Phe Leu Glu Lys Ile Ala Pro Leu Ile
385 390 395 400

Val Gln Glu Lys Ser Ile Ala Ala Gln Asn Ser Tyr Ile His Ile Leu
405 410 415

Ala Asp Ser Leu Ala Ser Phe Asp Tyr Thr Gln Ile Glu Gln Ile Val
420 425 430

Asn Glu Ser Arg Gln Val Gln Arg Gln Asn Arg Met Glu Arg Ile Ser
435 440 445

Arg Pro Thr Pro Ile Thr Met Pro Val Thr Lys Gln Leu Ser Ala Ile
450 455 460

Met Arg Ala Glu Ala His Leu Leu Tyr Arg Met Met Glu Ser Pro Leu
465 470 475 480

Val Leu Asn Asp Tyr Arg Leu Arg Glu Asp Phe Ala Phe Ala Thr Pro
485 490 495

Glu Phe Gln Val Leu His Asp Leu Leu Gly Gln Tyr Gly Asn Leu Pro
500 505 510

Pro Glu Val Leu Ala Glu Gln Thr Glu Glu Val Glu Arg Ala Trp Tyr
515 520 525

Gln Val Leu Ala Gln Asp Leu Pro Ala Glu Ile Ser Pro Gln Glu Leu
530 535 540

Ser Glu Val Glu Met Thr Arg Asn Lys Ala Leu Leu Asn Gln Asp Asn
545 550 555 560

Met Arg Ile Lys Lys Lys Val Gln Glu Ala Ser His Val Gly Asp Thr
565 570 575

Asp Thr Ala Leu Glu Glu Leu Glu Arg Leu Ile Ser Gln Lys Arg Arg
580 585 590

Met Glu

(2) INFORMATION FOR SEQ ID NO: 91:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 900 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..897

(D) OTHER INFORMATION: Era

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91:

ATG ACT TTT AAA TCA GGC TTT GTA GCC ATT TTA GGA CGT CCC AAT GTT 48
Met Thr Phe Lys Ser Gly Phe Val Ala Ile Leu Gly Arg Pro Asn Val
1 5 10 15

GGG AAG TCA ACC TTT TTA AAT CAC GTT ATG GGG CAA AAG ATT GCC ATC 96
Gly Lys Ser Thr Phe Leu Asn His Val Met Gly Gln Lys Ile Ala Ile
20 25 30

ATG AGT GAC AAG GCG CAG ACA ACG CGC AAT AAA ATC ATG GGA ATT TAC 144
Met Ser Asp Lys Ala Gln Thr Thr Arg Asn Lys Ile Met Gly Ile Tyr
35 40 45

ACG ACT GAT AAG GAG CAA ATT GTC TTT ATC GAC ACA CCA GGG ATT CAC 192
Thr Thr Asp Lys Glu Gln Ile Val Phe Ile Asp Thr Pro Gly Ile His
50 55 60

AAA CCT AAA ACA GCT CTC GGA GAT TTC ATG GTT GAG TCT GCC TAC AGT 240
Lys Pro Lys Thr Ala Leu Gly Asp Phe Met Val Glu Ser Ala Tyr Ser
65 70 75 80

ACC CTT CGC GAA GTG GAC ACT GTT CTT TTC ATG GTG CCT GCT GAT GAA 288
Thr Leu Arg Glu Val Asp Thr Val Leu Phe Met Val Pro Ala Asp Glu
85 90 95

GCG CGT GGT AAG GGG GAC GAT ATG ATT ATC GAG CGT CTC AAG GCT GCC 336
Ala Arg Gly Lys Gly Asp Asp Met Ile Ile Glu Arg Leu Lys Ala Ala
100 105 110

AAG GTT CCT GTG ATT TTG GTG GTG AAT AAA ATC GAT AAG GTC CAT CCA 384
Lys Val Pro Val Ile Leu Val Val Asn Lys Ile Asp Lys Val His Pro
115 120 125

GAC CAG CTC TTG TCT CAG ATT GAT GAC TTC CGT AAT CAA ATG GAC TTT 432
Asp Gln Leu Leu Ser Gln Ile Asp Asp Phe Arg Asn Gln Met Asp Phe
130 135 140

AAG GAA ATT GTT CCA ATC TCA GCC CTT CAG GGA AAT AAC GTG TCT CGT 480
Lys Glu Ile Val Pro Ile Ser Ala Leu Gln Gly Asn Asn Val Ser Arg
145 150 155 160

CTA GTG GAT ATT TTG AGT GAA AAT CTG GAT GAA GGT TTC CAA TAT TTC 528
Leu Val Asp Ile Leu Ser Glu Asn Leu Asp Glu Gly Phe Gln Tyr Phe
165 170 175

CCG TCT GAT CAA ATC ACA GAT CAT CCA GAA CGT TTC TTA GTT TCA GAA 576
Pro Ser Asp Gln Ile Thr Asp His Pro Glu Arg Phe Leu Val Ser Glu
180 185 190

ATG GTT CGC GAG AAA GTC TTG CAC CTA ACT CGT GAA GAG ATT CCG CAT 624
Met Val Arg Glu Lys Val Leu His Leu Thr Arg Glu Glu Ile Pro His
195 200 205

TCT GTA GCA GTA GTT GTT GAC TCT ATG AAA CGA GAC GAA GAG ACA GAC 672
Ser Val Ala Val Val Val Asp Ser Met Lys Arg Asp Glu Glu Thr Asp
210 215 220

AAG GTT CAC ATC CGT GCA ACC ATC ATG GTC GAG CGC GAT AGC CAA AAA 720
Lys Val His Ile Arg Ala Thr Ile Met Val Glu Arg Asp Ser Gln Lys
225 230 235 240

GGG ATT ATC ATC GGT AAA GGT GGC GCT ATG CTT AAG AAA ATC GGT AGT 768
Gly Ile Ile Ile Gly Lys Gly Gly Ala Met Leu Lys Lys Ile Gly Ser
245 250 255

ATG GCC CGT CGT GAT ATC GAA CTC ATG CTA GGA GAC AAG GTC TTC CTA 816
Met Ala Arg Arg Asp Ile Glu Leu Met Leu Gly Asp Lys Val Phe Leu
260 265 270

GAA ACC TGG GTC AAG GTC AAG AAA AAC TGG CGC GAT AAA AAG CTA GAT 864
Glu Thr Trp Val Lys Val Lys Lys Asn Trp Arg Asp Lys Lys Leu Asp
275 280 285

TTG GCT GAC TTT GGC TAT AAT GAA AGA GAA TAC TAA 900
Leu Ala Asp Phe Gly Tyr Asn Glu Arg Glu Tyr
290 295






(2) INFORMATION FOR SEQ ID NO: 92: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 299 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92: 

Met Thr Phe Lys Ser Gly Phe Val Ala Ile Leu Gly Arg Pro Asn Val
1 5 10 15

Gly Lys Ser Thr Phe Leu Asn His Val Met Gly Gln Lys Ile Ala Ile
20 25 30

Met Ser Asp Lys Ala Gln Thr Thr Arg Asn Lys Ile Met Gly Ile Tyr
35 40 45

Thr Thr Asp Lys Glu Gln Ile Val Phe Ile Asp Thr Pro Gly Ile His
50 55 60

Lys Pro Lys Thr Ala Leu Gly Asp Phe Met Val Glu Ser Ala Tyr Ser
65 70 75 80

Thr Leu Arg Glu Val Asp Thr Val Leu Phe Met Val Pro Ala Asp Glu
85 90 95

Ala Arg Gly Lys Gly Asp Asp Met Ile Ile Glu Arg Leu Lys Ala Ala
100 105 110

Lys Val Pro Val Ile Leu Val Val Asn Lys Ile Asp Lys Val His Pro
115 120 125

Asp Gln Leu Leu Ser Gln Ile Asp Asp Phe Arg Asn Gln Met Asp Phe
130 135 140

Lys Glu Ile Val Pro Ile Ser Ala Leu Gln Gly Asn Asn Val Ser Arg
145 150 155 160

Leu Val Asp Ile Leu Ser Glu Asn Leu Asp Glu Gly Phe Gln Tyr Phe
165 170 175

Pro Ser Asp Gln Ile Thr Asp His Pro Glu Arg Phe Leu Val Ser Glu
180 185 190

Met Val Arg Glu Lys Val Leu His Leu Thr Arg Glu Glu Ile Pro His
195 200 205

Ser Val Ala Val Val Val Asp Ser Met Lys Arg Asp Glu Glu Thr Asp
210 215 220

Lys Val His Ile Arg Ala Thr Ile Met Val Glu Arg Asp Ser Gln Lys
225 230 235 240

Gly Ile Ile Ile Gly Lys Gly Gly Ala Met Leu Lys Lys Ile Gly Ser
245 250 255

Met Ala Arg Arg Asp Ile Glu Leu Met Leu Gly Asp Lys Val Phe Leu
260 265 270 

Glu Thr Trp Val Lys Val Lys Lys Asn Trp Arg Asp Lys Lys Leu Asp 
275 280 285 

Leu Ala Asp Phe Gly Tyr Asn Glu Arg Glu Tyr 
290 295 



(2) INFORMATION FOR SEQ ID NO: 93: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1011 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1008 

(D) OTHER INFORMATION: Gcp 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93: 

ATG AAG GAT AGA TAT ATT TTA GCA TTT GAG ACA TCC TGT GAT GAG ACC 48
Met Lys Asp Arg Tyr Ile Leu Ala Phe Glu Thr Ser Cys Asp Glu Thr
1 5 10 15

AGT GTC GCC GTC TTG AAA AAC GAC GAT GAG CTC TTG TCC AAT GTC ATT 96
Ser Val Ala Val Leu Lys Asn Asp Asp Glu Leu Leu Ser Asn Val Ile
20 25 30

GCT AGT CAA ATT GAG AGT CAC AAA CGT TTT GGT GGC GTA GTG CCC GAA 144
Ala Ser Gln Ile Glu Ser His Lys Arg Phe Gly Gly Val Val Pro Glu
35 40 45

GTA GCC AGT CGT CAC CAT GTC GAG GTC ATT ACA GCC TGT ATC GAG GAG 192
Val Ala Ser Arg His His Val Glu Val Ile Thr Ala Cys Ile Glu Glu
50 55 60

GCA TTG GCA GAA GCA GGG ATT ACC GAA GAG GAC GTG ACA GCT GTT GCG 240
Ala Leu Ala Glu Ala Gly Ile Thr Glu Glu Asp Val Thr Ala Val Ala
65 70 75 80

GTT ACC TAC GGA CCA GGC TTG GTC GGA GCC TTG CTA GTT GGT TTG TCA 288
Val Thr Tyr Gly Pro Gly Leu Val Gly Ala Leu Leu Val Gly Leu Ser
85 90 95

GCT GCC AAG GCC TTT GCT TGG GCT CAC GGA CTT CCA CTG ATT CCT GTT 336
Ala Ala Lys Ala Phe Ala Trp Ala His Gly Leu Pro Leu Ile Pro Val

(2) INFORMATION FOR SEQ ID NO: 94:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 336 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94:

Met Lys Asp Arg Tyr Ile Leu Ala Phe Glu Thr Ser Cys Asp Glu Thr
1 5 10 15

Ser Val Ala Val Leu Lys Asn Asp Asp Glu Leu Leu Ser Asn Val Ile
20 25 30

Ala Ser Gln Ile Glu Ser His Lys Arg Phe Gly Gly Val Val Pro Glu
35 40 45

Val Ala Ser Arg His His Val Glu Val Ile Thr Ala Cys Ile Glu Glu
50 55 60

Ala Leu Ala Glu Ala Gly Ile Thr Glu Glu Asp Val Thr Ala Val Ala
65 70 75 80

Val Thr Tyr Gly Pro Gly Leu Val Gly Ala Leu Leu Val Gly Leu Ser
85 90 95

Ala Ala Lys Ala Phe Ala Trp Ala His Gly Leu Pro Leu Ile Pro Val
100 105 110

Asn His Met Ala Gly His Leu Met Ala Ala Gln Ser Val Glu Pro Leu
115 120 125

Glu Phe Pro Leu Leu Ala Leu Leu Val Ser Gly Gly His Thr Glu Leu
130 135 140

Val Tyr Val Ser Glu Ala Gly Asp Tyr Lys Ile Val Gly Glu Thr Arg
145 150 155 160

Asp Asp Ala Val Gly Glu Ala Tyr Asp Lys Val Gly Arg Val Met Gly
165 170 175

Leu Thr Tyr Pro Ala Gly Arg Glu Ile Asp Glu Leu Ala His Gln Gly
180 185 190

His Asp Ile Tyr Asp Phe Pro Arg Ala Met Ile Lys Glu Asp Asn Leu
195 200 205

Glu Phe Ser Phe Ser Gly Leu Lys Ser Ala Phe Ile Asn Leu His His
210 215 220

Asn Ala Glu Gln Lys Gly Glu Ser Leu Ser Thr Glu Asp Leu Cys Ala
225 230 235 240

Ser Phe Gln Ala Ala Val Met Asp Ile Leu Met Ala Lys Thr Lys Lys
245 250 255

Ala Leu Glu Lys Tyr Pro Val Lys Thr Leu Val Val Ala Gly Gly Val
260 265 270



Ala Ala Asn Lys Gly Leu Arg Glu Arg Leu Ala Thr Glu Ile Thr Asp
275 280 285

Val Asn Val Ile Ile Pro Pro Leu Arg Leu Cys Gly Asp Asn Ala Gly
290 295 300

Met Ile Ala Tyr Ala Ser Val Ser Glu Trp Asn Lys Glu Asn Phe Ala
305 310 315 320 

Asn Leu Asp Leu Asn Ala Lys Pro Ser Leu Ala Phe Asp Thr Met Glu 
325 330 335 

(2) INFORMATION FOR SEQ ID NO: 95: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 774 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..771

(D) OTHER INFORMATION: HI0454

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95:

ATG ATT TTT GAT ACA CAT ACA CAC TTG AAT GTA GAA GAA TTT GCA GGT 48
Met Ile Phe Asp Thr His Thr His Leu Asn Val Glu Glu Phe Ala Gly
1 5 10 15 

CGT GAG GCA GAA GAA ATT GCC TTG GCT GCT GAG ATG GGT GTG ACA CAG 96
Arg Glu Ala Glu Glu Ile Ala Leu Ala Ala Glu Met Gly Val Thr Gln
20 25 30

ATG AAT ATT GTT GGT TTT GAT AAA CCG ACG ATT GAG CAT GCC TTG GAG 144
Met Asn Ile Val Gly Phe Asp Lys Pro Thr Ile Glu His Ala Leu Glu
35 40 45

TTG GTA GAT GAG TAT GAG CAG CTC TAT GCG ACT ATT GGT TGG CAT CCT 192
Leu Val Asp Glu Tyr Glu Gln Leu Tyr Ala Thr Ile Gly Trp His Pro
50 55 60

ACA GAA GCT GGT ACT TAT ACA GAG GAA GTT GAG GCT TAC TTG TTG GAT 240
Thr Glu Ala Gly Thr Tyr Thr Glu Glu Val Glu Ala Tyr Leu Leu Asp
65 70 75 80

AAG TTA AAA CAT TCC AAG GTT GTG GCT TTA GGT GAA ATT GGC TTA GAC 288
Lys Leu Lys His Ser Lys Val Val Ala Leu Gly Glu Ile Gly Leu Asp
85 90 95

TAC CAT TGG ATG ACA GCG CCC AAA GAG GTG CAG GAG CAG GTT TTT CGC 336
Tyr His Trp Met Thr Ala Pro Lys Glu Val Gln Glu Gln Val Phe Arg
100 105 110

CGT CAG ATT CAG CTA TCT AAG GAC TTG GAT TTG CCT TTT GTT GTC CAT 384
Arg Gln Ile Gln Leu Ser Lys Asp Leu Asp Leu Pro Phe Val Val His
115 120 125

ACC CGT GAT GCG CTG GAA GAT ACC TAT GAG ATT ATC AAG AGT GAG GGC 432
Thr Arg Asp Ala Leu Glu Asp Thr Tyr Glu Ile Ile Lys Ser Glu Gly
130 135 140

GTT GGT CCT CGT GGT GGT ATC ATG CAT TCA TTT TCA GGG ACG CTT GAG 480
Val Gly Pro Arg Gly Gly Ile Met His Ser Phe Ser Gly Thr Leu Glu
145 150 155 160

TGG GCA GAG AAG TTT GTG GAT CTT GGT ATG ACC ATT TCC TTC TCA GGA 528
Trp Ala Glu Lys Phe Val Asp Leu Gly Met Thr Ile Ser Phe Ser Gly
165 170 175

GTG GTG ACC TTC AAG AAG GCA ACT GAC CTC CAA GAA GCA GCT AAA GAG 576
Val Val Thr Phe Lys Lys Ala Thr Asp Leu Gln Glu Ala Ala Lys Glu
180 185 190

TTA CCT TTG GAC AAG ATG TTG GTA GAA ACA GAT GCG CCT TAC TTA GCA 624
Leu Pro Leu Asp Lys Met Leu Val Glu Thr Asp Ala Pro Tyr Leu Ala
195 200 205

CCT GTA CCC AAG CGT GGT CGT GAA AAT AAA ACA GCC TAT ACT CGC TAT 672
Pro Val Pro Lys Arg Gly Arg Glu Asn Lys Thr Ala Tyr Thr Arg Tyr
210 215 220

GTG GTC GAC TTT ATC GCT GAC TTG CGT GGT ATG ACG ACA GAA GAG CTG 720
Val Val Asp Phe Ile Ala Asp Leu Arg Gly Met Thr Thr Glu Glu Leu
225 230 235 240

GCG GTA GCA ACG ACT GCA AAT GCA GAA CGC ATT TTT GGA TTG GAC AGC 768
Ala Val Ala Thr Thr Ala Asn Ala Glu Arg Ile Phe Gly Leu Asp Ser
245 250 255

AAG TAA 774
Lys



(2) INFORMATION FOR SEQ ID NO: 96:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 257 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96:

Met Ile Phe Asp Thr His Thr His Leu Asn Val Glu Glu Phe Ala Gly
1 5 10 15



Arg Glu Ala Glu Glu Ile Ala Leu Ala Ala Glu Met Gly Val Thr Gln
20 25 30

Met Asn Ile Val Gly Phe Asp Lys Pro Thr Ile Glu His Ala Leu Glu
35 40 45

Leu Val Asp Glu Tyr Glu Gln Leu Tyr Ala Thr Ile Gly Trp His Pro
50 55 60

Thr Glu Ala Gly Thr Tyr Thr Glu Glu Val Glu Ala Tyr Leu Leu Asp
65 70 75 80

Lys Leu Lys His Ser Lys Val Val Ala Leu Gly Glu Ile Gly Leu Asp
85 90 95

Tyr His Trp Met Thr Ala Pro Lys Glu Val Gln Glu Gln Val Phe Arg
100 105 110

Arg Gln Ile Gln Leu Ser Lys Asp Leu Asp Leu Pro Phe Val Val His
115 120 125

Thr Arg Asp Ala Leu Glu Asp Thr Tyr Glu Ile Ile Lys Ser Glu Gly
130 135 140

Val Gly Pro Arg Gly Gly Ile Met His Ser Phe Ser Gly Thr Leu Glu
145 150 155 160

Trp Ala Glu Lys Phe Val Asp Leu Gly Met Thr Ile Ser Phe Ser Gly
165 170 175

Val Val Thr Phe Lys Lys Ala Thr Asp Leu Gln Glu Ala Ala Lys Glu
180 185 190

Leu Pro Leu Asp Lys Met Leu Val Glu Thr Asp Ala Pro Tyr Leu Ala
195 200 205

Pro Val Pro Lys Arg Gly Arg Glu Asn Lys Thr Ala Tyr Thr Arg Tyr
210 215 220

Val Val Asp Phe Ile Ala Asp Leu Arg Gly Met Thr Thr Glu Glu Leu
225 230 235 240

Ala Val Ala Thr Thr Ala Asn Ala Glu Arg Ile Phe Gly Leu Asp Ser
245 250 255

Lys



(2) INFORMATION FOR SEQ ID NO: 97:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1959 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1959

(D) OTHER INFORMATION: Ligase

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97:

ATG AAT AAA AGA ATG AAT GAG TTA GTC GCT TTG CTC AAT CGC TAT GCG 48
Met Asn Lys Arg Met Asn Glu Leu Val Ala Leu Leu Asn Arg Tyr Ala
1 5 10 15

ACT GAG TAC TAT ACC AGC GAT AAT CCC TCG GTT TCA GAC AGT GAG TAT 96 
Thr Glu Tyr Tyr Thr Ser Asp Asn Pro Ser Val Ser Asp Ser Glu Tyr 
20 25 30 



GAC CGC CTT TAC CGT GAG TTG GTC GAG TTA GAA ACT GCT TAT CCA GAG 144
Asp Arg Leu Tyr Arg Glu Leu Val Glu Leu Glu Thr Ala Tyr Pro Glu
35 40 45

GAA GTG CTA GCA GAC AGT CCG ACT CAT CGT GTT GGT GGC AAG GTT TTA 192
Gln Val Leu Ala Asp Ser Pro Thr His Arg Val Gly Gly Lys Val Leu
50 55 60

GAT GGT TTT GAA AAA TAC AGT CAT CAG TAT CCT CTT TAT AGT TTG CAG 240
Asp Gly Phe Glu Lys Tyr Ser His Gln Tyr Pro Leu Tyr Ser Leu Gln
65 70 75 80

GAT GCT TTT TCA CGT GAG GAG CTA GAT GCT TTT GAT GCG CGT GTT CGT 288
Asp Ala Phe Ser Arg Glu Glu Leu Asp Ala Phe Asp Ala Arg Val Arg
85 90 95

AAG GAA GTG GCT CAT CCG ACC TAT ATT TGT GAG CTG AAA ATC GAT GGC 336
Lys Glu Val Ala His Pro Thr Tyr Ile Cys Glu Leu Lys Ile Asp Gly
100 105 110

TTA TCT ATC TCG CTG ACT TAT GAA AAG GGG ATT TTG GTT GCT GGG GTA 384
Leu Ser Ile Ser Leu Thr Tyr Glu Lys Gly Ile Leu Val Ala Gly Val
115 120 125



ACA CGT GGA GAT GGT TCA ATT GGT GAA AAT ATC ACA GAA AAC CTC AAG 432
Thr Arg Gly Asp Gly Ser Ile Gly Glu Asn Ile Thr Glu Asn Leu Lys
130 135 140

CGT GTT AAG GAC ATC CCT TTG ACT TTG CCA GAA GAA CTA GAT ATC ACA 480
Arg Val Lys Asp Ile Pro Leu Thr Leu Pro Glu Glu Leu Asp Ile Thr
145 150 155 160

GTT CGT GGG GAA TGT TAC ATG CCA CGC GCT TCC TTT GAC CAA GTT AAC 528
Val Arg Gly Glu Cys Tyr Met Pro Arg Ala Ser Phe Asp Gln Val Asn
165 170 175

CAA GCG CGC CAA GAA AAT GGA GAG CCT GAA TTT GCT AAT CCT CGT AAT 576
Gln Ala Arg Gln Glu Asn Gly Glu Pro Glu Phe Ala Asn Pro Arg Asn
180 185 190

GCG GCA GCA GGA ACT CTG CGT CAG TTG GAT ACA GCA GTA GTT GCC AAG 624
Ala Ala Ala Gly Thr Leu Arg Gln Leu Asp Thr Ala Val Val Ala Lys
195 200 205

CGT AAT CTT GCA ACG TTT CTC TAT CAA GAA GCC AGC CCT TCA ACT CGT 672
Arg Asn Leu Ala Thr Phe Leu Tyr Gln Glu Ala Ser Pro Ser Thr Arg
210 215 220

GAT AGC CAA GAA AAG GGT TTG AAG TAC CTA GAA CAA CTA GGT TTT GTG 720
Asp Ser Gln Glu Lys Gly Leu Lys Tyr Leu Glu Gln Leu Gly Phe Val
225 230 235 240

GTC AAT CCT AAG CGA ATC TTG GCT GAA AAC ATA GAT GAA ATC TGG AAT 768
Val Asn Pro Lys Arg Ile Leu Ala Glu Asn Ile Asp Glu Ile Trp Asn
245 250 255

TTT ATC CAA GAA GTA GGA CAG GAA CGG GAA AAT CTG CCT TAC GAT ATT 816
Phe Ile Gln Glu Val Gly Gln Glu Arg Glu Asn Leu Pro Tyr Asp Ile
260 265 270

GAT GGA GTG GTA ATC AAG GTC AAC GAC CTA GCA AGT CAA GAA GAA CTT 864
Asp Gly Val Val Ile Lys Val Asn Asp Leu Ala Ser Gln Glu Glu Leu
275 280 285

GGT TTT ACC GTT AAG GCT CCA AAG TGG GCA GTA GCC TAC AAG TTC CCT 912
Gly Phe Thr Val Lys Ala Pro Lys Trp Ala Val Ala Tyr Lys Phe Pro
290 295 300

GCT GAA GAA AAA GAA GCT CAA CTC TTA TCA GTT GAC TGG ACA GTT GGC 960
Ala Glu Glu Lys Glu Ala Gln Leu Leu Ser Val Asp Trp Thr Val Gly
305 310 315 320

CGT ACC GGT GTT GTA ACT CCA ACT GCT AAT CTA ACA CCA GTA CAA CTT 1008
Arg Thr Gly Val Val Thr Pro Thr Ala Asn Leu Thr Pro Val Gln Leu
325 330 335

GCC GGT ACG ACT GTT AGC CGT GCG ACC CTG CAC AAT GTA GAT TAT ATT 1056
Ala Gly Thr Thr Val Ser Arg Ala Thr Leu His Asn Val Asp Tyr Ile
340 345 350



GCT GAA AAA GAT ATC CGA AAA GAC GAT ACG GTC ATT GTA TAT AAG GCT 1104
Ala Glu Lys Asp Ile Arg Lys Asp Asp Thr Val Ile Val Tyr Lys Ala
355 360 365

GGT GAC ATC ATC CCT GCC GTT TTA CGT GTG GTA GAG TCC AAA CGG GTT 1152
Gly Asp Ile Ile Pro Ala Val Leu Arg Val Val Glu Ser Lys Arg Val
370 375 380

TCT GAA GAA AAA CTA GAT ATC CCT ACA AAC TGT CCA AGT TGT AAC TCT 1200
Ser Glu Glu Lys Leu Asp Ile Pro Thr Asn Cys Pro Ser Cys Asn Ser
385 390 395 400

GAC TTG TTG CAC TTT GAA GAT GAA GTG GCC CTA CGT TGT ATC AAT CCG 1248
Asp Leu Leu His Phe Glu Asp Glu Val Ala Leu Arg Cys Ile Asn Pro
405 410 415

CGT TGC CCT GCT CAA ATC ATG GAA GGC TTG ATT CAC TTT GCT TCT CGT 1296
Arg Cys Pro Ala Gln Ile Met Glu Gly Leu Ile His Phe Ala Ser Arg
420 425 430

GAT GCT ATG AAT ATT ACA GGC CTT GGT CCA TCT ATT GTT GAG AAG CTT 1344
Asp Ala Met Asn Ile Thr Gly Leu Gly Pro Ser Ile Val Glu Lys Leu
435 440 445

TTT GCT GCT AAT TTA GTC AAG GAT GTG GCG GAT ATT TAT CGT TTG CAA 1392
Phe Ala Ala Asn Leu Val Lys Asp Val Ala Asp Ile Tyr Arg Leu Gln
450 455 460

GAA GAG GAT TTC CTC CTT TTA GAG GGG GTT AAG GAA AAG TCC GCT GCT 1440
Glu Glu Asp Phe Leu Leu Leu Glu Gly Val Lys Glu Lys Ser Ala Ala
465 470 475 480

AAA CTG TAT CAG GCT ATC CAA GCA TCA AAG GAA AAT TCT GCC GAG AAG 1488
Lys Leu Tyr Gln Ala Ile Gln Ala Ser Lys Glu Asn Ser Ala Glu Lys
485 490 495

CTC TTA TTT GGT TTG GGA ATT CGT CAT GTC GGA AGC AAG GCT AGT CAG 1536
Leu Leu Phe Gly Leu Gly Ile Arg His Val Gly Ser Lys Ala Ser Gln
500 505 510

CTT TTA CTT CAA TAT TTC CAT TCA ATT GAA AAT CTG TAT CAG GCA GAT 1584
Leu Leu Leu Gln Tyr Phe His Ser Ile Glu Asn Leu Tyr Gln Ala Asp
515 520 525

TCA GAG GAA GTG GCT AGT ATT GAA AGT CTA GGT GGC GTG ATT GCC AAA 1632
Ser Glu Glu Val Ala Ser Ile Glu Ser Leu Gly Gly Val Ile Ala Lys
530 535 540

AGT CTT CAG ACT TAT TTT GCG GCA GAA GGC TCT GAA ATT CTG CTC AGA 1680
Ser Leu Gln Thr Tyr Phe Ala Ala Glu Gly Ser Glu Ile Leu Leu Arg
545 550 555 560

GAA TTG AAA GAA ACT GGG GTC AAT CTG GAC TAT AAA GGA CAG ACG GTA 1728
Glu Leu Lys Glu Thr Gly Val Asn Leu Asp Tyr Lys Gly Gln Thr Val
565 570 575 

GTA GCG GAT GCG GCC TTG TCA GGT TTG ACC GTG GTA TTG ACA GGA AAA 1776 
Val Ala Asp Ala Ala Leu Ser Gly Leu Thr Val Val Leu Thr Gly Lys 
580 585 590 

TTG GAA CGA CTC AAG CGC TCA GAA GCT AAA AGT AAA CTC GAA AGT CTG 1824 
Leu Glu Arg Leu Lys Arg Ser Glu Ala Lys Ser Lys Leu Glu Ser Leu 
595 600 605 

GGT GCC AAA GTG ACA GGT AGT GTT TCT AAA AAG ACC GAC CTC GTC GTG 1872 
Gly Ala Lys Val Thr Gly Ser Val Ser Lys Lys Thr Asp Leu Val Val 
610 615 620 

GTA GGT GCA GAC GCT GGA AGT AAA CTG CAA AAA GCA CAA GAA CTT GGT 1920 
Val Gly Ala Asp Ala Gly Ser Lys Leu Gin Lys Ala Gin Glu Leu Gly 
625 630 635 640 

ATC CAG GTC AGA GAT GAG GCA TGG CTA GAA AGT TTG TAA 1959 
He Gin Val Arg Asp Glu Ala Trp Leu Glu Ser Leu 
645 650 



(2) INFORMATION FOR SEQ ID NO:98: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 653 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98: 
Met Asn Lys Arg Met Asn Glu Leu Val Ala Leu Leu Asn Arg Tyr Ala
1 5 10 15

Thr Glu Tyr Tyr Thr Ser Asp Asn Pro Ser Val Ser Asp Ser Glu Tyr
20 25 30

Asp Arg Leu Tyr Arg Glu Leu Val Glu Leu Glu Thr Ala Tyr Pro Glu
35 40 45

Gln Val Leu Ala Asp Ser Pro Thr His Arg Val Gly Gly Lys Val Leu
50 55 60

Asp Gly Phe Glu Lys Tyr Ser His Gln Tyr Pro Leu Tyr Ser Leu Gln
65 70 75 80

Asp Ala Phe Ser Arg Glu Glu Leu Asp Ala Phe Asp Ala Arg Val Arg
85 90 95

Lys Glu Val Ala His Pro Thr Tyr Ile Cys Glu Leu Lys Ile Asp Gly
100 105 110

Leu Ser Ile Ser Leu Thr Tyr Glu Lys Gly Ile Leu Val Ala Gly Val
115 120 125

Thr Arg Gly Asp Gly Ser Ile Gly Glu Asn Ile Thr Glu Asn Leu Lys
130 135 140

Arg Val Lys Asp Ile Pro Leu Thr Leu Pro Glu Glu Leu Asp Ile Thr
145 150 155 160

Val Arg Gly Glu Cys Tyr Met Pro Arg Ala Ser Phe Asp Gln Val Asn
165 170 175

Gln Ala Arg Gln Glu Asn Gly Glu Pro Glu Phe Ala Asn Pro Arg Asn
180 185 190

Ala Ala Ala Gly Thr Leu Arg Gln Leu Asp Thr Ala Val Val Ala Lys
195 200 205

Arg Asn Leu Ala Thr Phe Leu Tyr Gln Glu Ala Ser Pro Ser Thr Arg
210 215 220

Asp Ser Gln Glu Lys Gly Leu Lys Tyr Leu Glu Gln Leu Gly Phe Val
225 230 235 240

Val Asn Pro Lys Arg Ile Leu Ala Glu Asn Ile Asp Glu Ile Trp Asn
245 250 255

Phe Ile Gln Glu Val Gly Gln Glu Arg Glu Asn Leu Pro Tyr Asp Ile
260 265 270

Asp Gly Val Val Ile Lys Val Asn Asp Leu Ala Ser Gln Glu Glu Leu
275 280 285

Gly Phe Thr Val Lys Ala Pro Lys Trp Ala Val Ala Tyr Lys Phe Pro
290 295 300

Ala Glu Glu Lys Glu Ala Gln Leu Leu Ser Val Asp Trp Thr Val Gly
305 310 315 320

Arg Thr Gly Val Val Thr Pro Thr Ala Asn Leu Thr Pro Val Gln Leu
325 330 335

Ala Gly Thr Thr Val Ser Arg Ala Thr Leu His Asn Val Asp Tyr Ile
340 345 350

Ala Glu Lys Asp Ile Arg Lys Asp Asp Thr Val Ile Val Tyr Lys Ala
355 360 365

Gly Asp Ile Ile Pro Ala Val Leu Arg Val Val Glu Ser Lys Arg Val
370 375 380

Ser Glu Glu Lys Leu Asp Ile Pro Thr Asn Cys Pro Ser Cys Asn Ser
385 390 395 400

Asp Leu Leu His Phe Glu Asp Glu Val Ala Leu Arg Cys Ile Asn Pro
405 410 415

Arg Cys Pro Ala Gln Ile Met Glu Gly Leu Ile His Phe Ala Ser Arg
420 425 430

Asp Ala Met Asn Ile Thr Gly Leu Gly Pro Ser Ile Val Glu Lys Leu
435 440 445

Phe Ala Ala Asn Leu Val Lys Asp Val Ala Asp Ile Tyr Arg Leu Gln
450 455 460

Glu Glu Asp Phe Leu Leu Leu Glu Gly Val Lys Glu Lys Ser Ala Ala
465 470 475 480

Lys Leu Tyr Gln Ala Ile Gln Ala Ser Lys Glu Asn Ser Ala Glu Lys
485 490 495

Leu Leu Phe Gly Leu Gly Ile Arg His Val Gly Ser Lys Ala Ser Gln
500 505 510

Leu Leu Leu Gln Tyr Phe His Ser Ile Glu Asn Leu Tyr Gln Ala Asp
515 520 525

Ser Glu Glu Val Ala Ser Ile Glu Ser Leu Gly Gly Val Ile Ala Lys
530 535 540

Ser Leu Gln Thr Tyr Phe Ala Ala Glu Gly Ser Glu Ile Leu Leu Arg
545 550 555 560

Glu Leu Lys Glu Thr Gly Val Asn Leu Asp Tyr Lys Gly Gln Thr Val
565 570 575

Val Ala Asp Ala Ala Leu Ser Gly Leu Thr Val Val Leu Thr Gly Lys
580 585 590

Leu Glu Arg Leu Lys Arg Ser Glu Ala Lys Ser Lys Leu Glu Ser Leu
595 600 605

Gly Ala Lys Val Thr Gly Ser Val Ser Lys Lys Thr Asp Leu Val Val
610 615 620

Val Gly Ala Asp Ala Gly Ser Lys Leu Gln Lys Ala Gln Glu Leu Gly
625 630 635 640

Ile Gln Val Arg Asp Glu Ala Trp Leu Glu Ser Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 99:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 981 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..981
(D) OTHER INFORMATION: MraY

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99:

ATG TTT ATT TCC ATC AGT GCT GGA ATT GTG ACA TTT TTA CTA ACT TTA 48
Met Phe Ile Ser Ile Ser Ala Gly Ile Val Thr Phe Leu Leu Thr Leu
1 5 10 15

GTA GGA ATT CCG GCC TTT ATC CAA TTT TAT AGA AAG GCG CAA ATT ACA 96
Val Gly Ile Pro Ala Phe Ile Gln Phe Tyr Arg Lys Ala Gln Ile Thr
20 25 30

GGC CAG CAG ATG CAT GAG GAT GTC AAA CAG CAT CAG GCA AAA GCT GGG 144
Gly Gln Gln Met His Glu Asp Val Lys Gln His Gln Ala Lys Ala Gly
35 40 45

ACT CCT ACA ATG GGA GGT TTG GTT TTC TTG ATT ACT TCT GTT TTG GTT 192
Thr Pro Thr Met Gly Gly Leu Val Phe Leu Ile Thr Ser Val Leu Val
50 55 60

GCT TTC TTT TTC GCC CTA TTT AGT AGC CAA TTC AGC AAT AAT GTG GGA 240
Ala Phe Phe Phe Ala Leu Phe Ser Ser Gln Phe Ser Asn Asn Val Gly
65 70 75 80

ATG ATT TTG TTC ATC TTG GTC TTG TAT GGC TTG GTC GGA TTT TTA GAT 288
Met Ile Leu Phe Ile Leu Val Leu Tyr Gly Leu Val Gly Phe Leu Asp
85 90 95

GAC TTT CTC AAG GTC TTT CGT AAA ATC AAT GAG GGG CTT AAT CCT AAG 336
Asp Phe Leu Lys Val Phe Arg Lys Ile Asn Glu Gly Leu Asn Pro Lys
100 105 110

CAA AAA TTA GCT CTT CAG CTT CTA GGT GGA GTT ATC TTC TAT CTT TTC 384
Gln Lys Leu Ala Leu Gln Leu Leu Gly Gly Val Ile Phe Tyr Leu Phe
115 120 125



TAT GAG CGC GGT GGC GAT ATC CTG TCT GTC TTT GGT TAT CCA GTT CAT 432
Tyr Glu Arg Gly Gly Asp Ile Leu Ser Val Phe Gly Tyr Pro Val His
130 135 140

TTG GGA TTT TTC TAT ATT TTC TTC GCT CTT TTC TGG CTA GTC GGT TTT 480
Leu Gly Phe Phe Tyr Ile Phe Phe Ala Leu Phe Trp Leu Val Gly Phe
145 150 155 160

TCA AAC GCA GTA AAC TTG ACA GAC GGT GTT GAC GGT TTA GCT AGT ATT 528
Ser Asn Ala Val Asn Leu Thr Asp Gly Val Asp Gly Leu Ala Ser Ile
165 170 175

TCC GTT GTG ATT AGT TTG TTT GCC TAT GGA GTT ATT GCC TAT GTG CAA 576
Ser Val Val Ile Ser Leu Phe Ala Tyr Gly Val Ile Ala Tyr Val Gln
180 185 190

GGT CAG ATG GAT ATT CTT CTA GTG ATT CTT GCC ATG ATT GGT GGT TTG 624
Gly Gln Met Asp Ile Leu Leu Val Ile Leu Ala Met Ile Gly Gly Leu
195 200 205

CTC GGT TTC TTC ATC TTT AAC CAT AAG CCT GCC AAG GTC TTT ATG GGT 672
Leu Gly Phe Phe Ile Phe Asn His Lys Pro Ala Lys Val Phe Met Gly
210 215 220

GAT GTG GGA AGT TTG GCC CTA GGT GGG ATG CTG GCA GCT ATC TCT ATG 720
Asp Val Gly Ser Leu Ala Leu Gly Gly Met Leu Ala Ala Ile Ser Met
225 230 235 240

GCT CTC CAC CAG GAA TGG ACT CTC TTG ATT ATC GGA ATT GTG TAT GTT 768
Ala Leu His Gln Glu Trp Thr Leu Leu Ile Ile Gly Ile Val Tyr Val
245 250 255

TTT GAA ACA ACT TCT GTT ATG ATG CAA GTC AGT TAT TTC AAA CTG ACA 816
Phe Glu Thr Thr Ser Val Met Met Gln Val Ser Tyr Phe Lys Leu Thr
260 265 270

GGT GGT AAA CGT ATT TTC CGT ATG ACG CCT GTA CAT CAC CAT TTT GAG 864
Gly Gly Lys Arg Ile Phe Arg Met Thr Pro Val His His His Phe Glu
275 280 285

CTT GGG GGA TTG TCT GGT AAA GGA AAT CCT TGG AGC GAG TGG AAG GTT 912
Leu Gly Gly Leu Ser Gly Lys Gly Asn Pro Trp Ser Glu Trp Lys Val
290 295 300

GAC TTC TTC TTT TGG GGA GTT GGG CTT CTA GCA AGT CTC CTG ACC CTC 960
Asp Phe Phe Phe Trp Gly Val Gly Leu Leu Ala Ser Leu Leu Thr Leu
305 310 315 320

GCA ATT TTG TAT TTG ATG TAA 981
Ala Ile Leu Tyr Leu Met
325
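
The stated lengths can be cross-checked with simple arithmetic: the 981 nucleotides at location 1..981 form 327 codons, the last of which is the TAA stop codon, which leaves the 326 residues reported for SEQ ID NO: 100. The following is a minimal sketch of that check; the function name `protein_length` and its parameters are illustrative assumptions and are not part of the listing.

```python
def protein_length(cds_start: int, cds_end: int, includes_stop: bool = True) -> int:
    """Return the number of residues encoded by a CDS given its 1-based location.

    includes_stop says whether the stated location covers the stop codon.
    """
    n_nt = cds_end - cds_start + 1
    if n_nt % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    codons = n_nt // 3
    # The stop codon is counted in the location but encodes no residue.
    return codons - 1 if includes_stop else codons


# Location 1..981 of SEQ ID NO: 99, stop codon included in the location.
print(protein_length(1, 981))
```

When a feature location stops short of the termination codon, passing `includes_stop=False` gives the residue count directly.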



(2) INFORMATION FOR SEQ ID NO: 100: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 326 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100: 

Met Phe Ile Ser Ile Ser Ala Gly Ile Val Thr Phe Leu Leu Thr Leu
1 5 10 15

Val Gly Ile Pro Ala Phe Ile Gln Phe Tyr Arg Lys Ala Gln Ile Thr
20 25 30

Gly Gln Gln Met His Glu Asp Val Lys Gln His Gln Ala Lys Ala Gly
35 40 45

Thr Pro Thr Met Gly Gly Leu Val Phe Leu Ile Thr Ser Val Leu Val
50 55 60

Ala Phe Phe Phe Ala Leu Phe Ser Ser Gln Phe Ser Asn Asn Val Gly
65 70 75 80

Met Ile Leu Phe Ile Leu Val Leu Tyr Gly Leu Val Gly Phe Leu Asp
85 90 95






Asp Phe Leu Lys Val Phe Arg Lys Ile Asn Glu Gly Leu Asn Pro Lys
100 105 110

Gln Lys Leu Ala Leu Gln Leu Leu Gly Gly Val Ile Phe Tyr Leu Phe
115 120 125

Tyr Glu Arg Gly Gly Asp Ile Leu Ser Val Phe Gly Tyr Pro Val His
130 135 140

Leu Gly Phe Phe Tyr Ile Phe Phe Ala Leu Phe Trp Leu Val Gly Phe
145 150 155 160

Ser Asn Ala Val Asn Leu Thr Asp Gly Val Asp Gly Leu Ala Ser Ile
165 170 175

Ser Val Val Ile Ser Leu Phe Ala Tyr Gly Val Ile Ala Tyr Val Gln
180 185 190

Gly Gln Met Asp Ile Leu Leu Val Ile Leu Ala Met Ile Gly Gly Leu
195 200 205

Leu Gly Phe Phe Ile Phe Asn His Lys Pro Ala Lys Val Phe Met Gly
210 215 220

Asp Val Gly Ser Leu Ala Leu Gly Gly Met Leu Ala Ala Ile Ser Met
225 230 235 240

Ala Leu His Gln Glu Trp Thr Leu Leu Ile Ile Gly Ile Val Tyr Val
245 250 255

Phe Glu Thr Thr Ser Val Met Met Gln Val Ser Tyr Phe Lys Leu Thr
260 265 270

Gly Gly Lys Arg Ile Phe Arg Met Thr Pro Val His His His Phe Glu
275 280 285

Leu Gly Gly Leu Ser Gly Lys Gly Asn Pro Trp Ser Glu Trp Lys Val
290 295 300

Asp Phe Phe Phe Trp Gly Val Gly Leu Leu Ala Ser Leu Leu Thr Leu
305 310 315 320

Ala Ile Leu Tyr Leu Met
325

(2) INFORMATION FOR SEQ ID NO: 101:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 369 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..366
(D) OTHER INFORMATION: Dpj

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101:

ATG AGA ATG ATA GTT GGA CAC GGA ATT GAC ATC GAA GAA TTG GCT TCG 48
Met Arg Met Ile Val Gly His Gly Ile Asp Ile Glu Glu Leu Ala Ser
1 5 10 15

ATA GAA AGC GCA GTT ACA CGA CAT GAA GGA TTT GCT AAG CGT GTA CTG 96
Ile Glu Ser Ala Val Thr Arg His Glu Gly Phe Ala Lys Arg Val Leu
20 25 30

ACC GCT CAG GAA ATG GAG CGC TTC ACC AGT CTC AAA GGA CGC AGG CAA 144
Thr Ala Gln Glu Met Glu Arg Phe Thr Ser Leu Lys Gly Arg Arg Gln
35 40 45

ATA GAA TAT TTA GCT GGT CGC TGG TCG GCT AAG GAG GCC TTT TCC AAG 192
Ile Glu Tyr Leu Ala Gly Arg Trp Ser Ala Lys Glu Ala Phe Ser Lys
50 55 60

GCT ATG GGA ACG GGC ATT AGC AAG CTC GGT TTT CAG GAT TTG GAA GTC 240
Ala Met Gly Thr Gly Ile Ser Lys Leu Gly Phe Gln Asp Leu Glu Val
65 70 75 80

TTG AAC AAT GAA CGT GGG GCG CCT TAT TTT AGT CAG GCA CCA TTT TCA 288
Leu Asn Asn Glu Arg Gly Ala Pro Tyr Phe Ser Gln Ala Pro Phe Ser
85 90 95

GGA AAG ATT TGG CTG TCT ATC AGC CAC ACC GAT CAG TTT GTG ACA GCC 336
Gly Lys Ile Trp Leu Ser Ile Ser His Thr Asp Gln Phe Val Thr Ala
100 105 110

AGT GTC ATT TTG GAG GAA AAT CAT GAA AGC TAG 369
Ser Val Ile Leu Glu Glu Asn His Glu Ser
115 120
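
The amino acid line printed under each row of codons can be reproduced by conceptual translation. The sketch below assumes the standard genetic code and uses the final codon row of SEQ ID NO: 101 as input; the names `CODON_TABLE` and `translate` are illustrative only and do not come from the listing.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# (first base varies slowest). "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string (whitespace ignored) into one-letter residues."""
    seq = "".join(dna.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# Final codon row of SEQ ID NO: 101, including the TAG stop codon.
print(translate("AGT GTC ATT TTG GAG GAA AAT CAT GAA AGC TAG"))
```

The trailing `*` in the result corresponds to the stop codon, which is why the nucleotide count of 369 exceeds three times the 122 residues of the encoded protein.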

(2) INFORMATION FOR SEQ ID NO: 102:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 122 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102:

Met Arg Met Ile Val Gly His Gly Ile Asp Ile Glu Glu Leu Ala Ser
1 5 10 15

Ile Glu Ser Ala Val Thr Arg His Glu Gly Phe Ala Lys Arg Val Leu
20 25 30

Thr Ala Gln Glu Met Glu Arg Phe Thr Ser Leu Lys Gly Arg Arg Gln
35 40 45

Ile Glu Tyr Leu Ala Gly Arg Trp Ser Ala Lys Glu Ala Phe Ser Lys
50 55 60

Ala Met Gly Thr Gly Ile Ser Lys Leu Gly Phe Gln Asp Leu Glu Val
65 70 75 80

Leu Asn Asn Glu Arg Gly Ala Pro Tyr Phe Ser Gln Ala Pro Phe Ser
85 90 95

Gly Lys Ile Trp Leu Ser Ile Ser His Thr Asp Gln Phe Val Thr Ala
100 105 110

Ser Val Ile Leu Glu Glu Asn His Glu Ser
115 120
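
Protein sequences in this listing use three-letter residue codes, whereas most sequence analysis tools expect one-letter codes. A minimal conversion sketch follows; the mapping covers the twenty standard residues, and the names `THREE_TO_ONE` and `to_one_letter` are illustrative assumptions rather than part of the listing.

```python
# Three-letter to one-letter codes for the twenty standard amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(residues: str) -> str:
    """Convert whitespace-separated three-letter codes to a one-letter string.

    Raises KeyError on any token that is not a standard residue code,
    which also flags misread tokens in a scanned listing.
    """
    return "".join(THREE_TO_ONE[token] for token in residues.split())


# Last ten residues of a protein row written in three-letter codes.
print(to_one_letter("Ser Val Ile Leu Glu Glu Asn His Glu Ser"))
```

Numbering rows must be removed before conversion, since any non-residue token raises an error by design.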



(2) INFORMATION FOR SEQ ID NO: 103: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1260 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..1260 

(D) OTHER INFORMATION: MurZ 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103: 

ATG AGA AAA ATT GTT ATC AAT GGT GGA TTA CCA CTG CAA GGT GAA ATC 48
Met Arg Lys Ile Val Ile Asn Gly Gly Leu Pro Leu Gln Gly Glu Ile
1 5 10 15

ACT ATT AGT GGT GCT AAA AAT AGT GTC GTT GCC TTA ATT CCA GCT ATT 96
Thr Ile Ser Gly Ala Lys Asn Ser Val Val Ala Leu Ile Pro Ala Ile
20 25 30

ATC TTG GCT GAT GAT GTG GTG ACT TTG GAT TGC GTT CCA GAT ATT TCG 144
Ile Leu Ala Asp Asp Val Val Thr Leu Asp Cys Val Pro Asp Ile Ser
35 40 45

GAT GTA GCC AGT CTT GTC GAA ATC ATG GAA TTG ATG GGA GCT ACT GTT 192
Asp Val Ala Ser Leu Val Glu Ile Met Glu Leu Met Gly Ala Thr Val
50 55 60

AAG CGT TAT GAC GAT GTA TTG GAG ATT GAC CCA AGA GGT GTT CAA AAT 240
Lys Arg Tyr Asp Asp Val Leu Glu Ile Asp Pro Arg Gly Val Gln Asn
65 70 75 80

ATT CCA ATG CCT TAT GGT AAA ATT AAC AGT CTT CGT GCA TCT TAC TAT 288
Ile Pro Met Pro Tyr Gly Lys Ile Asn Ser Leu Arg Ala Ser Tyr Tyr
85 90 95

TTT TAT GGG AGC CTC TTA GGC CGT TTT GGT GAA GCG ACA GTT GGT CTA 336
Phe Tyr Gly Ser Leu Leu Gly Arg Phe Gly Glu Ala Thr Val Gly Leu
100 105 110

CCG GGA GGA TGT GAT CTT GGT CCT CGT CCG ATT GAC TTA CAC CTT AAG 384
Pro Gly Gly Cys Asp Leu Gly Pro Arg Pro Ile Asp Leu His Leu Lys
115 120 125

GCG TTT GAA GCT ATG GGT GCC ACT GCT AGC TAC GAG GGA GAT AAC ATG 432
Ala Phe Glu Ala Met Gly Ala Thr Ala Ser Tyr Glu Gly Asp Asn Met
130 135 140

AAG TTA TCT GCT AAA GAT ACA GGA CTT CAT GGT GCA AGT ATT TAC ATG 480
Lys Leu Ser Ala Lys Asp Thr Gly Leu His Gly Ala Ser Ile Tyr Met
145 150 155 160

GAT ACG GTT AGT GTG GGA GCA ACG ATT AAT ACG ATG ATT GCT GCG GTT 528
Asp Thr Val Ser Val Gly Ala Thr Ile Asn Thr Met Ile Ala Ala Val
165 170 175

AAA GCA AAT GGT CGT ACT ATT ATT GAA AAT GCA GCC CGT GAA CCT GAG 576
Lys Ala Asn Gly Arg Thr Ile Ile Glu Asn Ala Ala Arg Glu Pro Glu
180 185 190

ATT ATT GAT GTA GCT ACT CTC TTG AAT AAT ATG GGT GCC CAT ATC CGT 624
Ile Ile Asp Val Ala Thr Leu Leu Asn Asn Met Gly Ala His Ile Arg
195 200 205

GGG GCA GGA ACT AAT ATC ATC ATT ATT GAT GGT GTT GAA AGA TTA CAT 672
Gly Ala Gly Thr Asn Ile Ile Ile Ile Asp Gly Val Glu Arg Leu His
210 215 220

GGG ACA CGT CAT CAG GTG ATT CCA GAC CGC ATT GAA GCT GGA ACA TAT 720
Gly Thr Arg His Gln Val Ile Pro Asp Arg Ile Glu Ala Gly Thr Tyr
225 230 235 240

ATA TCT TTA GCT GCT GCA GTT GGT AAA GGA ATT CGT ATA AAT AAT GTT 768
Ile Ser Leu Ala Ala Ala Val Gly Lys Gly Ile Arg Ile Asn Asn Val
245 250 255

CTT TAC GAA CAC CTG GAA GGG TTT GTT GCT AAG TTG GAA GAA ATG GGA 816
Leu Tyr Glu His Leu Glu Gly Phe Val Ala Lys Leu Glu Glu Met Gly
260 265 270

GTG AGA ATG ACT GTA TCT GAA GAC AGC ATT TTT GTC GAG GAA CAG TCT 864
Val Arg Met Thr Val Ser Glu Asp Ser Ile Phe Val Glu Glu Gln Ser
275 280 285

AAT TTG AAA GCA ATC AAT ATT AAG ACA GCT CCT TAC CCA GGC TTT GCA 912
Asn Leu Lys Ala Ile Asn Ile Lys Thr Ala Pro Tyr Pro Gly Phe Ala
290 295 300

ACT GAT TTG CAA CAA CCG CTT ACC CCT CTT TTA CTA AGA GCG AAT GGT 960
Thr Asp Leu Gln Gln Pro Leu Thr Pro Leu Leu Leu Arg Ala Asn Gly
305 310 315 320

CGT GGT ACA ATT GTC GAT ACG ATT TAC GAA AAA CGT GTA AAT CAT GTT 1008
Arg Gly Thr Ile Val Asp Thr Ile Tyr Glu Lys Arg Val Asn His Val
325 330 335

TTT GAA CTA GCA AAG ATG GAT GCG GAT ATT TCG ACA ACA AAT GGT CAT 1056
Phe Glu Leu Ala Lys Met Asp Ala Asp Ile Ser Thr Thr Asn Gly His
340 345 350

ATT TTG TAC ACG GGT GGA CGT GAT TTA CGT GGG GCC AGT GTT AAA GCG 1104
Ile Leu Tyr Thr Gly Gly Arg Asp Leu Arg Gly Ala Ser Val Lys Ala
355 360 365

ACC GAC TTA AGA GCT GGG GCT GCA CTA GTC ATT GCT GGG CTT ATG GCT 1152
Thr Asp Leu Arg Ala Gly Ala Ala Leu Val Ile Ala Gly Leu Met Ala
370 375 380

GAA GGC AAA ACT GAA ATT ACC AAT ATC GAG TTT ATC TTA CGT GGT TAT 1200
Glu Gly Lys Thr Glu Ile Thr Asn Ile Glu Phe Ile Leu Arg Gly Tyr
385 390 395 400

TCT GAT ATT ATC GAA AAA TTA CGT AAT TTA GGA GCG GAT ATT AGA CTT 1248
Ser Asp Ile Ile Glu Lys Leu Arg Asn Leu Gly Ala Asp Ile Arg Leu
405 410 415

GTT GAG GAT TAA 1260
Val Glu Asp
419

(2) INFORMATION FOR SEQ ID NO: 104:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 419 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104:

Met Arg Lys Ile Val Ile Asn Gly Gly Leu Pro Leu Gln Gly Glu Ile
1 5 10 15

Thr Ile Ser Gly Ala Lys Asn Ser Val Val Ala Leu Ile Pro Ala Ile
20 25 30

Ile Leu Ala Asp Asp Val Val Thr Leu Asp Cys Val Pro Asp Ile Ser
35 40 45

Asp Val Ala Ser Leu Val Glu Ile Met Glu Leu Met Gly Ala Thr Val
50 55 60

Lys Arg Tyr Asp Asp Val Leu Glu Ile Asp Pro Arg Gly Val Gln Asn
65 70 75 80

Ile Pro Met Pro Tyr Gly Lys Ile Asn Ser Leu Arg Ala Ser Tyr Tyr
85 90 95

Phe Tyr Gly Ser Leu Leu Gly Arg Phe Gly Glu Ala Thr Val Gly Leu
100 105 110

Pro Gly Gly Cys Asp Leu Gly Pro Arg Pro Ile Asp Leu His Leu Lys
115 120 125



Ala Phe Glu Ala Met Gly Ala Thr Ala Ser Tyr Glu Gly Asp Asn Met
130 135 140

Lys Leu Ser Ala Lys Asp Thr Gly Leu His Gly Ala Ser Ile Tyr Met
145 150 155 160

Asp Thr Val Ser Val Gly Ala Thr Ile Asn Thr Met Ile Ala Ala Val
165 170 175

Lys Ala Asn Gly Arg Thr Ile Ile Glu Asn Ala Ala Arg Glu Pro Glu
180 185 190

Ile Ile Asp Val Ala Thr Leu Leu Asn Asn Met Gly Ala His Ile Arg
195 200 205

Gly Ala Gly Thr Asn Ile Ile Ile Ile Asp Gly Val Glu Arg Leu His
210 215 220

Gly Thr Arg His Gln Val Ile Pro Asp Arg Ile Glu Ala Gly Thr Tyr
225 230 235 240



Ile Ser Leu Ala Ala Ala Val Gly Lys Gly Ile Arg Ile Asn Asn Val
245 250 255

Leu Tyr Glu His Leu Glu Gly Phe Val Ala Lys Leu Glu Glu Met Gly
260 265 270

Val Arg Met Thr Val Ser Glu Asp Ser Ile Phe Val Glu Glu Gln Ser
275 280 285

Asn Leu Lys Ala Ile Asn Ile Lys Thr Ala Pro Tyr Pro Gly Phe Ala
290 295 300

Thr Asp Leu Gln Gln Pro Leu Thr Pro Leu Leu Leu Arg Ala Asn Gly
305 310 315 320

Arg Gly Thr Ile Val Asp Thr Ile Tyr Glu Lys Arg Val Asn His Val
325 330 335

Phe Glu Leu Ala Lys Met Asp Ala Asp Ile Ser Thr Thr Asn Gly His
340 345 350



Ile Leu Tyr Thr Gly Gly Arg Asp Leu Arg Gly Ala Ser Val Lys Ala
355 360 365

Thr Asp Leu Arg Ala Gly Ala Ala Leu Val Ile Ala Gly Leu Met Ala
370 375 380

Glu Gly Lys Thr Glu Ile Thr Asn Ile Glu Phe Ile Leu Arg Gly Tyr
385 390 395 400

Ser Asp Ile Ile Glu Lys Leu Arg Asn Leu Gly Ala Asp Ile Arg Leu
405 410 415

Val Glu Asp
419



(2) INFORMATION FOR SEQ ID NO: 105:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1008 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..1008
(D) OTHER INFORMATION: FtsZ

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105:

ATG ACA TTT TCA TTT GAT ACA GCT GCT GCT CAA GGG GCA GTG ATT AAA 48
Met Thr Phe Ser Phe Asp Thr Ala Ala Ala Gln Gly Ala Val Ile Lys
1 5 10 15

GTA ATT GGT GTC GGT GGA GGT GGT GGC AAT GCC ATC AAC CGT ATG GTC 96
Val Ile Gly Val Gly Gly Gly Gly Gly Asn Ala Ile Asn Arg Met Val
20 25 30

GAC GAA GGT GTT ACA GGC GTA GAA TTT ATC GCA GCA AAC ACA GAT GTA 144
Asp Glu Gly Val Thr Gly Val Glu Phe Ile Ala Ala Asn Thr Asp Val
35 40 45

CAA GCA TTG AGT AGT ACA AAA GCT GAG ACT GTT ATT CAG TTG GGA CCT 192
Gln Ala Leu Ser Ser Thr Lys Ala Glu Thr Val Ile Gln Leu Gly Pro
50 55 60

AAA TTG ACT CGT GGT TTG GGT GCA GGA GGT CAA CCT GAG GTT GGT CGT 240
Lys Leu Thr Arg Gly Leu Gly Ala Gly Gly Gln Pro Glu Val Gly Arg
65 70 75 80

AAA GCC GCT GAA GAA AGC GAA GAA ACA CTG ACG GAA GCT ATT AGT GGT 288
Lys Ala Ala Glu Glu Ser Glu Glu Thr Leu Thr Glu Ala Ile Ser Gly
85 90 95

GCC GAT ATG GTC TTC ATC ACT GCT GGT ATG GGA GGA GGC TCT GGA ACT 336
Ala Asp Met Val Phe Ile Thr Ala Gly Met Gly Gly Gly Ser Gly Thr
100 105 110

GGA GCT GCT CCT GTT ATT GCT CGT ATC GCC AAA GAT TTA GGT GCG CTT 384
Gly Ala Ala Pro Val Ile Ala Arg Ile Ala Lys Asp Leu Gly Ala Leu
115 120 125

ACA GTT GGT GTT GTA ACA CGT CCC TTT GGT TTT GAA GGA AGT AAG CGT 432
Thr Val Gly Val Val Thr Arg Pro Phe Gly Phe Glu Gly Ser Lys Arg
130 135 140

GGA CAA TTT GCT GTA GAA GGA ATC AAT CAA CTT CGT GAG CAT GTA GAC 480
Gly Gln Phe Ala Val Glu Gly Ile Asn Gln Leu Arg Glu His Val Asp
145 150 155 160

ACT CTA TTG ATT ATC TCA AAC AAC AAT TTG CTT GAA ATT GTT GAT AAG 528
Thr Leu Leu Ile Ile Ser Asn Asn Asn Leu Leu Glu Ile Val Asp Lys
165 170 175



AAA ACA CCG CTT TTG GAG GCT CTT AGC GAA GCG GAT AAC GTT CTT CGT 576
Lys Thr Pro Leu Leu Glu Ala Leu Ser Glu Ala Asp Asn Val Leu Arg
180 185 190

CAA GGT GTT CAA GGG ATT ACC GAT TTG ATT ACC AAT CCA GGA TTG ATT 624
Gln Gly Val Gln Gly Ile Thr Asp Leu Ile Thr Asn Pro Gly Leu Ile
195 200 205

AAC CTT GAC TTT GCC GAT GTG AAA ACG GTA ATG GCA AAC AAA GGG AAT 672
Asn Leu Asp Phe Ala Asp Val Lys Thr Val Met Ala Asn Lys Gly Asn
210 215 220

GCT CTT ATG GGT ATT GGT ATC GGT AGT GGA GAA GAA CGT GTG GTA GAA 720
Ala Leu Met Gly Ile Gly Ile Gly Ser Gly Glu Glu Arg Val Val Glu
225 230 235 240

GCG GCA CGT AAG GCA ATC TAT TCA CCA CTT CTT GAA ACA ACT ATT GAC 768
Ala Ala Arg Lys Ala Ile Tyr Ser Pro Leu Leu Glu Thr Thr Ile Asp
245 250 255

GGT GCT GAG GAT GTT ATC GTC AAC GTT ACT GGT GGT CTT GAC TTA ACC 816
Gly Ala Glu Asp Val Ile Val Asn Val Thr Gly Gly Leu Asp Leu Thr
260 265 270

TTG ATT GAG GCA GAA GAG GCT TCA CAA ATT GTG AAC CAG GCA GCA GGT 864
Leu Ile Glu Ala Glu Glu Ala Ser Gln Ile Val Asn Gln Ala Ala Gly
275 280 285

CAA GGA GTG AAC ATC TGG CTC GGT ACT TCA ATT GAT GAA AGT ATG CGT 912
Gln Gly Val Asn Ile Trp Leu Gly Thr Ser Ile Asp Glu Ser Met Arg
290 295 300

GAT GAA ATT CGT GTA ACA GTT GTC GCA ACG GGT GTT CGT CAA GAC CGC 960
Asp Glu Ile Arg Val Thr Val Val Ala Thr Gly Val Arg Gln Asp Arg
305 310 315 320

GTA GAA AAG GTT GTG GCT CCA CAA GCT AGA TCA CCG CGC CTA GGA TAA 1008
Val Glu Lys Val Val Ala Pro Gln Ala Arg Ser Pro Arg Leu Gly *
325 330 335

(2) INFORMATION FOR SEQ ID NO: 106:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 336 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106:

Met Thr Phe Ser Phe Asp Thr Ala Ala Ala Gln Gly Ala Val Ile Lys
1 5 10 15

Val Ile Gly Val Gly Gly Gly Gly Gly Asn Ala Ile Asn Arg Met Val
20 25 30

Asp Glu Gly Val Thr Gly Val Glu Phe Ile Ala Ala Asn Thr Asp Val
35 40 45

Gln Ala Leu Ser Ser Thr Lys Ala Glu Thr Val Ile Gln Leu Gly Pro
50 55 60

Lys Leu Thr Arg Gly Leu Gly Ala Gly Gly Gln Pro Glu Val Gly Arg
65 70 75 80

Lys Ala Ala Glu Glu Ser Glu Glu Thr Leu Thr Glu Ala Ile Ser Gly
85 90 95

Ala Asp Met Val Phe Ile Thr Ala Gly Met Gly Gly Gly Ser Gly Thr
100 105 110

Gly Ala Ala Pro Val Ile Ala Arg Ile Ala Lys Asp Leu Gly Ala Leu
115 120 125

Thr Val Gly Val Val Thr Arg Pro Phe Gly Phe Glu Gly Ser Lys Arg
130 135 140

Gly Gln Phe Ala Val Glu Gly Ile Asn Gln Leu Arg Glu His Val Asp
145 150 155 160

Thr Leu Leu Ile Ile Ser Asn Asn Asn Leu Leu Glu Ile Val Asp Lys
165 170 175

Lys Thr Pro Leu Leu Glu Ala Leu Ser Glu Ala Asp Asn Val Leu Arg
180 185 190

Gln Gly Val Gln Gly Ile Thr Asp Leu Ile Thr Asn Pro Gly Leu Ile
195 200 205

Asn Leu Asp Phe Ala Asp Val Lys Thr Val Met Ala Asn Lys Gly Asn
210 215 220

Ala Leu Met Gly Ile Gly Ile Gly Ser Gly Glu Glu Arg Val Val Glu
225 230 235 240

Ala Ala Arg Lys Ala Ile Tyr Ser Pro Leu Leu Glu Thr Thr Ile Asp
245 250 255

Gly Ala Glu Asp Val Ile Val Asn Val Thr Gly Gly Leu Asp Leu Thr
260 265 270

Leu Ile Glu Ala Glu Glu Ala Ser Gln Ile Val Asn Gln Ala Ala Gly
275 280 285

Gln Gly Val Asn Ile Trp Leu Gly Thr Ser Ile Asp Glu Ser Met Arg
290 295 300

Asp Glu Ile Arg Val Thr Val Val Ala Thr Gly Val Arg Gln Asp Arg
305 310 315 320

Val Glu Lys Val Val Ala Pro Gln Ala Arg Ser Pro Arg Leu Gly
325 330 335

(2) INFORMATION FOR SEQ ID NO: 107:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 525 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..525
(D) OTHER INFORMATION: grpE

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107:

ATG GCC CAA GAT ATA AAA AAT GAA GAA GTA GAA GAA GTT CAA GAA GAG 48
Met Ala Gln Asp Ile Lys Asn Glu Glu Val Glu Glu Val Gln Glu Glu
1 5 10 15

GAA GTT GTG GAA ACA GCT GAA GAA ACA ACT CCT GAA AAG TCT GAG TTG 96
Glu Val Val Glu Thr Ala Glu Glu Thr Thr Pro Glu Lys Ser Glu Leu
20 25 30

GAC TTG GCA AAT GAA CGT GCA GAT GAG TTC GAA AAC AAA TAT CTT CGC 144
Asp Leu Ala Asn Glu Arg Ala Asp Glu Phe Glu Asn Lys Tyr Leu Arg
35 40 45

GCT CAT GCA GAA ATG CAA AAT ATC CAA CGC CGT GCC AAT GAA GAA CGT 192
Ala His Ala Glu Met Gln Asn Ile Gln Arg Arg Ala Asn Glu Glu Arg
50 55 60

CAA AAC TTG CAA CGT TAT CGT AGC CAG GAC TTG GCA AAA GCA ATC TTA 240
Gln Asn Leu Gln Arg Tyr Arg Ser Gln Asp Leu Ala Lys Ala Ile Leu
65 70 75 80

CCA TCT CTT GAC AAC CTT GAG CGT GCA CTT GCA GTT GAA GGT TTG ACA 288
Pro Ser Leu Asp Asn Leu Glu Arg Ala Leu Ala Val Glu Gly Leu Thr
85 90 95

GAT GAT GTG AAG AAG GGC TTG GCG ATG GTG CAA GAA AGC TTG ATT CAC 336
Asp Asp Val Lys Lys Gly Leu Ala Met Val Gln Glu Ser Leu Ile His
100 105 110

GCT TTG AAA GAA GAA GGA ATT GAA GAA ATC GCA GCA GAT GGC GAA TTT 384
Ala Leu Lys Glu Glu Gly Ile Glu Glu Ile Ala Ala Asp Gly Glu Phe
115 120 125

GAC CAT AAC TAC CAT ATG GCC ATC CAA ACT CTC CCA GGA GAC GAT GAA 432
Asp His Asn Tyr His Met Ala Ile Gln Thr Leu Pro Gly Asp Asp Glu
130 135 140

CAC CCA GTA GAT ACC ATC GCC CAA GTC TTT CAA AAA GGC TAC AAA CTC 480
His Pro Val Asp Thr Ile Ala Gln Val Phe Gln Lys Gly Tyr Lys Leu
145 150 155 160

CAT GAC CGC ATC CTA CGC CCA GCA ATG GTA GTG GTG TAT AAC TAA 525
His Asp Arg Ile Leu Arg Pro Ala Met Val Val Val Tyr Asn *
165 170 174

(2) INFORMATION FOR SEQ ID NO: 108:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 174 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108:



Met Ala Gln Asp Ile Lys Asn Glu Glu Val Glu Glu Val Gln Glu Glu
1 5 10 15

Glu Val Val Glu Thr Ala Glu Glu Thr Thr Pro Glu Lys Ser Glu Leu
20 25 30

Asp Leu Ala Asn Glu Arg Ala Asp Glu Phe Glu Asn Lys Tyr Leu Arg
35 40 45

Ala His Ala Glu Met Gln Asn Ile Gln Arg Arg Ala Asn Glu Glu Arg
50 55 60

Gln Asn Leu Gln Arg Tyr Arg Ser Gln Asp Leu Ala Lys Ala Ile Leu
65 70 75 80

Pro Ser Leu Asp Asn Leu Glu Arg Ala Leu Ala Val Glu Gly Leu Thr
85 90 95

Asp Asp Val Lys Lys Gly Leu Ala Met Val Gln Glu Ser Leu Ile His
100 105 110

Ala Leu Lys Glu Glu Gly Ile Glu Glu Ile Ala Ala Asp Gly Glu Phe
115 120 125

Asp His Asn Tyr His Met Ala Ile Gln Thr Leu Pro Gly Asp Asp Glu
130 135 140

His Pro Val Asp Thr Ile Ala Gln Val Phe Gln Lys Gly Tyr Lys Leu
145 150 155 160

His Asp Arg Ile Leu Arg Pro Ala Met Val Val Val Tyr Asn
165 170 174

(2) INFORMATION FOR SEQ ID NO: 109:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 582 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..582
(D) OTHER INFORMATION: HI1648

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109:

ATG AAA ATC GGA ATA TTG GCC TTG CAA GGG GCC TTT GCA GAA CAT GCA 48
Met Lys Ile Gly Ile Leu Ala Leu Gln Gly Ala Phe Ala Glu His Ala
1 5 10 15

AAA GTG CTA GAT CAA TTA GGT GTC GAG AGT GTA GAA CTC AGA AAT CTA 96
Lys Val Leu Asp Gln Leu Gly Val Glu Ser Val Glu Leu Arg Asn Leu
20 25 30

GAT GAT TTT CAG CAA GAT CAG AGT GAC TTG TCG GGT TTG ATT TTG CCT 144
Asp Asp Phe Gln Gln Asp Gln Ser Asp Leu Ser Gly Leu Ile Leu Pro
35 40 45

GGT GGT GAG TCT ACA ACC ATG GGC AAG CTC TTA CGT GAC CAG AAC ATG 192
Gly Gly Glu Ser Thr Thr Met Gly Lys Leu Leu Arg Asp Gln Asn Met
50 55 60

CTA CTT CCC ATA CGA GAA GCC ATT CTA TCT GGC TTA CCA GTG TTT GGG 240
Leu Leu Pro Ile Arg Glu Ala Ile Leu Ser Gly Leu Pro Val Phe Gly
65 70 75 80

ACC TGT GCG GGC TTA ATT TTG CTG GCT AAG GAA ATC ACT TCT CAG AAA 288
Thr Cys Ala Gly Leu Ile Leu Leu Ala Lys Glu Ile Thr Ser Gln Lys
85 90 95

GAG AGT CAT CTA GGA ACT ATG GAT ATG GTG GTC GAG CGT AAT GCT TAT 336
Glu Ser His Leu Gly Thr Met Asp Met Val Val Glu Arg Asn Ala Tyr
100 105 110

GGG CGC CAA TTA GGA AGT TTC TAC ACG GAA GCA GAA TGT AAG GGA GTT 384
Gly Arg Gln Leu Gly Ser Phe Tyr Thr Glu Ala Glu Cys Lys Gly Val
115 120 125

GGC AAG ATT CCA ATG ACC TTT ATC CGT GGT CCG ATT ATC AGT AGT GTT 432
Gly Lys Ile Pro Met Thr Phe Ile Arg Gly Pro Ile Ile Ser Ser Val
130 135 140

GGT GAG GGT GTA GAA ATT TTA GCA ATA GTG AAC AAT CAA ATT GTT GCA 480
Gly Glu Gly Val Glu Ile Leu Ala Ile Val Asn Asn Gln Ile Val Ala
145 150 155 160

GCC CAA GAA AAA AAT ATG TTG GTA AGT TCT TTT CAT CCA GAA TTG ACT 528
Ala Gln Glu Lys Asn Met Leu Val Ser Ser Phe His Pro Glu Leu Thr
165 170 175

GAT GAT GTG CGC TTG CAC CAG TAC TTT ATC AAT ATG TGT AAA GAA AAA 576
Asp Asp Val Arg Leu His Gln Tyr Phe Ile Asn Met Cys Lys Glu Lys
180 185 190

AGT TGA 582
Ser *



(2) INFORMATION FOR SEQ ID NO: 110:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 194 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110:

Met Lys Ile Gly Ile Leu Ala Leu Gln Gly Ala Phe Ala Glu His Ala
1 5 10 15

Lys Val Leu Asp Gln Leu Gly Val Glu Ser Val Glu Leu Arg Asn Leu
20 25 30

Asp Asp Phe Gln Gln Asp Gln Ser Asp Leu Ser Gly Leu Ile Leu Pro
35 40 45

Gly Gly Glu Ser Thr Thr Met Gly Lys Leu Leu Arg Asp Gln Asn Met
50 55 60

Leu Leu Pro Ile Arg Glu Ala Ile Leu Ser Gly Leu Pro Val Phe Gly
65 70 75 80

Thr Cys Ala Gly Leu Ile Leu Leu Ala Lys Glu Ile Thr Ser Gln Lys
85 90 95

Glu Ser His Leu Gly Thr Met Asp Met Val Val Glu Arg Asn Ala Tyr
100 105 110

Gly Arg Gln Leu Gly Ser Phe Tyr Thr Glu Ala Glu Cys Lys Gly Val
115 120 125

Gly Lys Ile Pro Met Thr Phe Ile Arg Gly Pro Ile Ile Ser Ser Val
130 135 140

Gly Glu Gly Val Glu Ile Leu Ala Ile Val Asn Asn Gln Ile Val Ala
145 150 155 160

Ala Gln Glu Lys Asn Met Leu Val Ser Ser Phe His Pro Glu Leu Thr
165 170 175

Asp Asp Val Arg Leu His Gln Tyr Phe Ile Asn Met Cys Lys Glu Lys
180 185 190

Ser

(2) INFORMATION FOR SEQ ID NO: 111:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 546 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..543
(D) OTHER INFORMATION: pgsA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111:

ATG AAA AAA GAA CAA ATT CCC AAT CTC TTA ACA ATA GGT CGA ATT CTC 48
Met Lys Lys Glu Gln Ile Pro Asn Leu Leu Thr Ile Gly Arg Ile Leu
1 5 10 15

TTT ATA CCT ATT TTT ATC TTT ATT TTA ACG ATA GGA AAT TCG ATA GAG 96
Phe Ile Pro Ile Phe Ile Phe Ile Leu Thr Ile Gly Asn Ser Ile Glu
20 25 30

AGT CAT ATA GTT GCA GCT ATT ATC TTT GCT GTT GCC AGT ATT ACC GAC 144
Ser His Ile Val Ala Ala Ile Ile Phe Ala Val Ala Ser Ile Thr Asp
35 40 45

TAT TTA GAT GGA TAT TTA GCT CGT AAA TGG AAT GTG GTC AGT AAT TTT 192
Tyr Leu Asp Gly Tyr Leu Ala Arg Lys Trp Asn Val Val Ser Asn Phe
50 55 60

GGT AAA TTT GCA GAT CCT ATG GCG GAT AAG TTA CTA GTT ATG TCG GCT 240
Gly Lys Phe Ala Asp Pro Met Ala Asp Lys Leu Leu Val Met Ser Ala
65 70 75 80

TTT ATT ATG TTG ATT GAG TTA GGT ATG GCT CCG GCT TGG ATT GTT GCA 288
Phe Ile Met Leu Ile Glu Leu Gly Met Ala Pro Ala Trp Ile Val Ala
85 90 95

GTG ATT ATC TGT CGT GAG TTA GCT GTG ACA GGT TTA AGG CTT TTA TTG 336
Val Ile Ile Cys Arg Glu Leu Ala Val Thr Gly Leu Arg Leu Leu Leu
100 105 110

GTT GAA ACT GGT GGA ACA ATT TTA GCA GCA GCA ATG CCT GGA AAA ATT 384 
Val Glu Thr Gly Gly Thr He Leu Ala Ala Ala Met Pro Gly Lys He 
60 115 120 125 
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AAA ACT TTT AGT CAG ATG TTT GCT ATT ATT TTC TTG CTA TTA CAT TGG 432 
Lys Thr Phe Ser Gin Met Phe Ala lie lie Phe Leu Leu Leu His Trp 
130 135 140 

5 ACT TTG CTT GGT CAA GTT CTA CTT TAT GTA GCC TTA TTT TTC ACT ATC 480 
Thr Leu Leu Gly Gin Val Leu Leu Tyr Val Ala Leu Phe Phe Thr lie 
145 150 155 160 

TAC TCT GGC TAT GAC TAT TTC AAG GGT AGT GCC TAT GTA TTT AAA GGG 528 
10 Tyr Ser Gly Tyr Asp Tyr Phe Lys Gly Ser Ala Tyr Val Phe Lys Gly 

165 170 175 

ACA TTT GGT TCG AAA TGA 54 6 

Thr Phe Gly Ser Lys 
15 180 

(2) INFORMATION FOR SEQ ID NO: 112: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 181 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:112:

Met Lys Lys Glu Gln Ile Pro Asn Leu Leu Thr Ile Gly Arg Ile Leu
1 5 10 15

Phe Ile Pro Ile Phe Ile Phe Ile Leu Thr Ile Gly Asn Ser Ile Glu
20 25 30

Ser His Ile Val Ala Ala Ile Ile Phe Ala Val Ala Ser Ile Thr Asp
35 40 45

Tyr Leu Asp Gly Tyr Leu Ala Arg Lys Trp Asn Val Val Ser Asn Phe
50 55 60

Gly Lys Phe Ala Asp Pro Met Ala Asp Lys Leu Leu Val Met Ser Ala
65 70 75 80

Phe Ile Met Leu Ile Glu Leu Gly Met Ala Pro Ala Trp Ile Val Ala
85 90 95

Val Ile Ile Cys Arg Glu Leu Ala Val Thr Gly Leu Arg Leu Leu Leu
100 105 110

Val Glu Thr Gly Gly Thr Ile Leu Ala Ala Ala Met Pro Gly Lys Ile
115 120 125

Lys Thr Phe Ser Gln Met Phe Ala Ile Ile Phe Leu Leu Leu His Trp
130 135 140

Thr Leu Leu Gly Gln Val Leu Leu Tyr Val Ala Leu Phe Phe Thr Ile
145 150 155 160

Tyr Ser Gly Tyr Asp Tyr Phe Lys Gly Ser Ala Tyr Val Phe Lys Gly
165 170 175

Thr Phe Gly Ser Lys
180

(2) INFORMATION FOR SEQ ID NO: 113: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1224 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: CDS

(B) LOCATION: 1..1221

(D) OTHER INFORMATION: RodA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113:

ATG AAA CGT TCT CTC GAC TCT AGA GTC GAT TAT AGT TTG CTC TTG CCA 48
Met Lys Arg Ser Leu Asp Ser Arg Val Asp Tyr Ser Leu Leu Leu Pro
1 5 10 15

GTA TTT TTT CTA CTG GTC ATC GGT GTG GTG GCT ATC TAT ATA GCC GTT 96
Val Phe Phe Leu Leu Val Ile Gly Val Val Ala Ile Tyr Ile Ala Val
20 25 30

AGT CAT GAT TAT CCC AAT AAT ATT CTG CCC ATT TTA GGG CAG CAG GTC 144
Ser His Asp Tyr Pro Asn Asn Ile Leu Pro Ile Leu Gly Gln Gln Val
35 40 45

GCC TGG ATT GCC TTG GGG CTT GTG ATT GGT TTT GTG GTC ATG CTC TTT 192
Ala Trp Ile Ala Leu Gly Leu Val Ile Gly Phe Val Val Met Leu Phe
50 55 60

AAT ACA GAA TTT CTT TGG AAG GTG ACC CCC TTT CTA TAT ATT TTA GGC 240
Asn Thr Glu Phe Leu Trp Lys Val Thr Pro Phe Leu Tyr Ile Leu Gly
65 70 75 80

TTG GGA CTT ATG ATC TTG CCG ATT GTA TTT TAT AAT CCA AGC TTA GTT 288
Leu Gly Leu Met Ile Leu Pro Ile Val Phe Tyr Asn Pro Ser Leu Val
85 90 95

GCA TCA ACG GGT GCC AAA AAC TGG GTA TCA ATA AAT GGA ATT ACC CTA 336
Ala Ser Thr Gly Ala Lys Asn Trp Val Ser Ile Asn Gly Ile Thr Leu
100 105 110

TTT CAA CCG TCA GAA TTT ATG AAG ATA TCC TAT ATC CTC ATG TTG GCT 384
Phe Gln Pro Ser Glu Phe Met Lys Ile Ser Tyr Ile Leu Met Leu Ala
115 120 125

CGT GTC ATT GTC CAA TTT ACA AAG AAA CAT AAG GAA TGG AGA CGC ACG 432
Arg Val Ile Val Gln Phe Thr Lys Lys His Lys Glu Trp Arg Arg Thr
130 135 140

GTT CCG CTG GAC TTT TTG TTA ATT TTC TGG ATG ATT CTC TTT ACC ATT 480
Val Pro Leu Asp Phe Leu Leu Ile Phe Trp Met Ile Leu Phe Thr Ile
145 150 155 160

CCA GTC CTA GTT CTT TTA GCA CTT CAA AGT GAC TTG GGG ACG GCT TTG 528
Pro Val Leu Val Leu Leu Ala Leu Gln Ser Asp Leu Gly Thr Ala Leu
165 170 175

GTT TTT GTA GCC ATT TTC TCA GGA ATC GTT TTA TTA TCA GGG GTT TCT 576
Val Phe Val Ala Ile Phe Ser Gly Ile Val Leu Leu Ser Gly Val Ser
180 185 190

TGG AAA ATT ATT ATC CCA GTA TTT GTG ACT GCT GTA ACA GGA GTT GCT 624
Trp Lys Ile Ile Ile Pro Val Phe Val Thr Ala Val Thr Gly Val Ala
195 200 205

GGT TTC TTA GCT ATC TTT ATT AGC AAG GAC GGA CGA GCT TTT CTT CAC 672
Gly Phe Leu Ala Ile Phe Ile Ser Lys Asp Gly Arg Ala Phe Leu His
210 215 220

CAG ATT GGA ATG CCG ACC TAC CAA ATC AAT CGG ATT TTG GCT TGG CTC 720
Gln Ile Gly Met Pro Thr Tyr Gln Ile Asn Arg Ile Leu Ala Trp Leu
225 230 235 240

AAT CCC TTT GAG TTT GCC CAA ACA ACG ACT TAC CAG CAG GCT CAA GGG 768
Asn Pro Phe Glu Phe Ala Gln Thr Thr Thr Tyr Gln Gln Ala Gln Gly
245 250 255

CAG ATT GCC ATT GGG AGT GGT GGC TTA TTT GGT CAG GGA TTT AAT GCT 816
Gln Ile Ala Ile Gly Ser Gly Gly Leu Phe Gly Gln Gly Phe Asn Ala
260 265 270

TCG AAT CTG CTT ATC CCA GTT CGA GAG TCA GAT ATG ATT TTT ACG GTT 864
Ser Asn Leu Leu Ile Pro Val Arg Glu Ser Asp Met Ile Phe Thr Val
275 280 285

ATT GCA GAA GAT TTT GGC TTT ATT GGC TCT GTC CTG GTT ATT GCC CTC 912
Ile Ala Glu Asp Phe Gly Phe Ile Gly Ser Val Leu Val Ile Ala Leu
290 295 300

TAT CTC ATG TTG ATT TAC CGT ATG TTG AAG ATT ACT CTT AAA TCA AAT 960
Tyr Leu Met Leu Ile Tyr Arg Met Leu Lys Ile Thr Leu Lys Ser Asn
305 310 315 320

AAC CAG TTC TAC ACT TAT ATT TCC ACA GGT TTG ATT ATG ATG TTG CTC 1008
Asn Gln Phe Tyr Thr Tyr Ile Ser Thr Gly Leu Ile Met Met Leu Leu
325 330 335

TTC CAC ATC TTT GAG AAT ATC GGT GCT GTG ACT GGA CTA CTT CCT TTG 1056
Phe His Ile Phe Glu Asn Ile Gly Ala Val Thr Gly Leu Leu Pro Leu
340 345 350

ACG GGG ATT CCC TTG CCT TTC ATT TCG CAA GGG GGA TCA GCG ATT ATC 1104
Thr Gly Ile Pro Leu Pro Phe Ile Ser Gln Gly Gly Ser Ala Ile Ile
355 360 365

AGT AAT CTG ATT GGT GTT GGT TTG CTT TTA TCG ATG AGT TAC CAG ACT 1152
Ser Asn Leu Ile Gly Val Gly Leu Leu Leu Ser Met Ser Tyr Gln Thr
370 375 380

AAT CTA GCT GAA GAA AAG AGC GGA AAA GTC CCA TTC AAA CGG AAA AAG 1200
Asn Leu Ala Glu Glu Lys Ser Gly Lys Val Pro Phe Lys Arg Lys Lys
385 390 395 400

GTT GTA TTA AAA CAA ATT AAA TAA 1224
Val Val Leu Lys Gln Ile Lys
405

(2) INFORMATION FOR SEQ ID NO: 114:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 407 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114:

Met Lys Arg Ser Leu Asp Ser Arg Val Asp Tyr Ser Leu Leu Leu Pro
1 5 10 15

Val Phe Phe Leu Leu Val Ile Gly Val Val Ala Ile Tyr Ile Ala Val
20 25 30

Ser His Asp Tyr Pro Asn Asn Ile Leu Pro Ile Leu Gly Gln Gln Val
35 40 45

Ala Trp Ile Ala Leu Gly Leu Val Ile Gly Phe Val Val Met Leu Phe
50 55 60

Asn Thr Glu Phe Leu Trp Lys Val Thr Pro Phe Leu Tyr Ile Leu Gly
65 70 75 80

Leu Gly Leu Met Ile Leu Pro Ile Val Phe Tyr Asn Pro Ser Leu Val
85 90 95

Ala Ser Thr Gly Ala Lys Asn Trp Val Ser Ile Asn Gly Ile Thr Leu
100 105 110

Phe Gln Pro Ser Glu Phe Met Lys Ile Ser Tyr Ile Leu Met Leu Ala
115 120 125

Arg Val Ile Val Gln Phe Thr Lys Lys His Lys Glu Trp Arg Arg Thr
130 135 140

Val Pro Leu Asp Phe Leu Leu Ile Phe Trp Met Ile Leu Phe Thr Ile
145 150 155 160

Pro Val Leu Val Leu Leu Ala Leu Gln Ser Asp Leu Gly Thr Ala Leu
165 170 175

Val Phe Val Ala Ile Phe Ser Gly Ile Val Leu Leu Ser Gly Val Ser
180 185 190

Trp Lys Ile Ile Ile Pro Val Phe Val Thr Ala Val Thr Gly Val Ala
195 200 205

Gly Phe Leu Ala Ile Phe Ile Ser Lys Asp Gly Arg Ala Phe Leu His
210 215 220

Gln Ile Gly Met Pro Thr Tyr Gln Ile Asn Arg Ile Leu Ala Trp Leu
225 230 235 240

Asn Pro Phe Glu Phe Ala Gln Thr Thr Thr Tyr Gln Gln Ala Gln Gly
245 250 255

Gln Ile Ala Ile Gly Ser Gly Gly Leu Phe Gly Gln Gly Phe Asn Ala
260 265 270

Ser Asn Leu Leu Ile Pro Val Arg Glu Ser Asp Met Ile Phe Thr Val
275 280 285

Ile Ala Glu Asp Phe Gly Phe Ile Gly Ser Val Leu Val Ile Ala Leu
290 295 300

Tyr Leu Met Leu Ile Tyr Arg Met Leu Lys Ile Thr Leu Lys Ser Asn
305 310 315 320

Asn Gln Phe Tyr Thr Tyr Ile Ser Thr Gly Leu Ile Met Met Leu Leu
325 330 335

Phe His Ile Phe Glu Asn Ile Gly Ala Val Thr Gly Leu Leu Pro Leu
340 345 350

Thr Gly Ile Pro Leu Pro Phe Ile Ser Gln Gly Gly Ser Ala Ile Ile
355 360 365

Ser Asn Leu Ile Gly Val Gly Leu Leu Leu Ser Met Ser Tyr Gln Thr
370 375 380

Asn Leu Ala Glu Glu Lys Ser Gly Lys Val Pro Phe Lys Arg Lys Lys
385 390 395 400

Val Val Leu Lys Gln Ile Lys
405

(2) INFORMATION FOR SEQ ID NO: 115: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1311 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 1..1311

(D) OTHER INFORMATION: SecY

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115:

ATG TTT TTT AAA TTA TTA AGA GAA GCT CTT AAA GTC AAG CAG GTT CGA 48
Met Phe Phe Lys Leu Leu Arg Glu Ala Leu Lys Val Lys Gln Val Arg
1 5 10 15

TCA AAA ATT TTA TTT ACA ATT TTT ATC GTT TTG GTC TTT CGT ATC GGA 96
Ser Lys Ile Leu Phe Thr Ile Phe Ile Val Leu Val Phe Arg Ile Gly
20 25 30

ACT AGC ATT ACA GTT CCT GGT GTG AAT GCC AAT AGC TTG AAT GCT TTA 144
Thr Ser Ile Thr Val Pro Gly Val Asn Ala Asn Ser Leu Asn Ala Leu
35 40 45

AGT GGA TTA TCC TTC TTA AAC ATG TTG AGC TTG GTG TCG GGG AAT GCC 192
Ser Gly Leu Ser Phe Leu Asn Met Leu Ser Leu Val Ser Gly Asn Ala
50 55 60

CTA AAA AAC TTT TCG ATT TTT GCC CTA GGA GTT AGT CCC TAT ATC ACC 240
Leu Lys Asn Phe Ser Ile Phe Ala Leu Gly Val Ser Pro Tyr Ile Thr
65 70 75 80

GCT TCT ATT GTT GTC CAA CTC TTG CAA ATG GAT ATT TTA CCC AAG TTT 288
Ala Ser Ile Val Val Gln Leu Leu Gln Met Asp Ile Leu Pro Lys Phe
85 90 95

GTA GAG TGG GGT AAA CAA GGG GAA GTA GGT CGA AGA AAA TTG AAT CAA 336
Val Glu Trp Gly Lys Gln Gly Glu Val Gly Arg Arg Lys Leu Asn Gln
100 105 110

GCT ACT CGT TAT ATT GCT CTA GTT CTC GCT TTT GTG CAA TCT ATC GGG 384
Ala Thr Arg Tyr Ile Ala Leu Val Leu Ala Phe Val Gln Ser Ile Gly
115 120 125

ATT ACA GCT GGT TTT AAT ACC TTG GCT GGA GCT CAA TTG ATT AAA ACT 432
Ile Thr Ala Gly Phe Asn Thr Leu Ala Gly Ala Gln Leu Ile Lys Thr
130 135 140

GCT TTA ACT CCA CAA GTT TTT CTG ACG ATT GGT ATC ATC TTA ACA GCT 480
Ala Leu Thr Pro Gln Val Phe Leu Thr Ile Gly Ile Ile Leu Thr Ala
145 150 155 160

GGT AGT ATG ATT GTC ACT TGG TTG GGT GAG CAA ATT ACA GAT AAG GGA 528
Gly Ser Met Ile Val Thr Trp Leu Gly Glu Gln Ile Thr Asp Lys Gly
165 170 175

TAC GGA AAC GGT GTT TCC ATG ATT ATC TTT GCC GGG ATT GTT TCC TCA 576
Tyr Gly Asn Gly Val Ser Met Ile Ile Phe Ala Gly Ile Val Ser Ser
180 185 190

ATT CCA GAG ATG ATT CAG GGC ATC TAT GTG GAC TAC TTT GTG AAC GTC 624
Ile Pro Glu Met Ile Gln Gly Ile Tyr Val Asp Tyr Phe Val Asn Val
195 200 205

CCA AGT AGC CGT ATC ACT TCA TCT ATC ATT TTC GTA ATC ATT TTG ATT 672
Pro Ser Ser Arg Ile Thr Ser Ser Ile Ile Phe Val Ile Ile Leu Ile
210 215 220

ATT ACT GTA TTG TTG ATT ATT TAC TTT ACA ACT TAT GTT CAA CAA GCA 720
Ile Thr Val Leu Leu Ile Ile Tyr Phe Thr Thr Tyr Val Gln Gln Ala
225 230 235 240

GAA TAC AAA ATT CCA ATC CAA TAT ACT AAG GTT GCA CAA GGT GCT CCA 768
Glu Tyr Lys Ile Pro Ile Gln Tyr Thr Lys Val Ala Gln Gly Ala Pro
245 250 255

TCT AGC TCT TAC CTT CCG TTA AAG GTA AAT CCT GCT GGA GTT ATC CCT 816
Ser Ser Ser Tyr Leu Pro Leu Lys Val Asn Pro Ala Gly Val Ile Pro
260 265 270

GTT ATC TTT GCC AGT TCG ATT ACT GCA GCG CCT GCG GCT ATT CTT CAG 864
Val Ile Phe Ala Ser Ser Ile Thr Ala Ala Pro Ala Ala Ile Leu Gln
275 280 285

TTT TTG AGT GCC ACA GGT CAT GAT TGG GCT TGG GTA AGG GTA GCA CAA 912
Phe Leu Ser Ala Thr Gly His Asp Trp Ala Trp Val Arg Val Ala Gln
290 295 300

GAG ATG TTG GCA ACT ACT TCT CCA ACT GGT ATT GCC ATG TAT GCT TTG 960
Glu Met Leu Ala Thr Thr Ser Pro Thr Gly Ile Ala Met Tyr Ala Leu
305 310 315 320

TTG ATT ATT CTC TTT ACA TTC TTC TAT ACG TTT GTA CAG ATT AAT CCT 1008
Leu Ile Ile Leu Phe Thr Phe Phe Tyr Thr Phe Val Gln Ile Asn Pro
325 330 335

GAA AAA GCA GCA GAG AGC CTA CAA AAG AGT GGT GCC TAT ATC CAT GGA 1056
Glu Lys Ala Ala Glu Ser Leu Gln Lys Ser Gly Ala Tyr Ile His Gly
340 345 350

GTT CGT CCT GGT AAA GGT ACA GAA GAA TAT ATG TCT AAA CTT CTT CGT 1104
Val Arg Pro Gly Lys Gly Thr Glu Glu Tyr Met Ser Lys Leu Leu Arg
355 360 365

CGT CTT GCA ACT GTT GGT TCC CTC TTC CTT GGT GTG ATT TCC ATT TTA 1152
Arg Leu Ala Thr Val Gly Ser Leu Phe Leu Gly Val Ile Ser Ile Leu
370 375 380

CCG ATT GCA GCT AAA GAT GTA TTT GGT CTT TCT GAT GTT GTT GCC TTT 1200
Pro Ile Ala Ala Lys Asp Val Phe Gly Leu Ser Asp Val Val Ala Phe
385 390 395 400

GGT GGA ACA AGT CTC TTG ATC ATT ATC TCT ACA GGT ATC GAA GGA ATC 1248
Gly Gly Thr Ser Leu Leu Ile Ile Ile Ser Thr Gly Ile Glu Gly Ile
405 410 415

AAG CAA TTG GAA GGT TAC CTA TTG AAA CGT AAG TAT GTT GGT TTC ATG 1296
Lys Gln Leu Glu Gly Tyr Leu Leu Lys Arg Lys Tyr Val Gly Phe Met
420 425 430

GAC AGA ACA GAA TAA 1311
Asp Arg Thr Glu *
435

(2) INFORMATION FOR SEQ ID NO: 116: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 437 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116:

Met Phe Phe Lys Leu Leu Arg Glu Ala Leu Lys Val Lys Gln Val Arg
1 5 10 15

Ser Lys Ile Leu Phe Thr Ile Phe Ile Val Leu Val Phe Arg Ile Gly
20 25 30

Thr Ser Ile Thr Val Pro Gly Val Asn Ala Asn Ser Leu Asn Ala Leu
35 40 45

Ser Gly Leu Ser Phe Leu Asn Met Leu Ser Leu Val Ser Gly Asn Ala
50 55 60

Leu Lys Asn Phe Ser Ile Phe Ala Leu Gly Val Ser Pro Tyr Ile Thr
65 70 75 80

Ala Ser Ile Val Val Gln Leu Leu Gln Met Asp Ile Leu Pro Lys Phe
85 90 95

Val Glu Trp Gly Lys Gln Gly Glu Val Gly Arg Arg Lys Leu Asn Gln
100 105 110

Ala Thr Arg Tyr Ile Ala Leu Val Leu Ala Phe Val Gln Ser Ile Gly
115 120 125

Ile Thr Ala Gly Phe Asn Thr Leu Ala Gly Ala Gln Leu Ile Lys Thr
130 135 140

Ala Leu Thr Pro Gln Val Phe Leu Thr Ile Gly Ile Ile Leu Thr Ala
145 150 155 160

Gly Ser Met Ile Val Thr Trp Leu Gly Glu Gln Ile Thr Asp Lys Gly
165 170 175

Tyr Gly Asn Gly Val Ser Met Ile Ile Phe Ala Gly Ile Val Ser Ser
180 185 190

Ile Pro Glu Met Ile Gln Gly Ile Tyr Val Asp Tyr Phe Val Asn Val
195 200 205

Pro Ser Ser Arg Ile Thr Ser Ser Ile Ile Phe Val Ile Ile Leu Ile
210 215 220

Ile Thr Val Leu Leu Ile Ile Tyr Phe Thr Thr Tyr Val Gln Gln Ala
225 230 235 240

Glu Tyr Lys Ile Pro Ile Gln Tyr Thr Lys Val Ala Gln Gly Ala Pro
245 250 255

Ser Ser Ser Tyr Leu Pro Leu Lys Val Asn Pro Ala Gly Val Ile Pro
260 265 270

Val Ile Phe Ala Ser Ser Ile Thr Ala Ala Pro Ala Ala Ile Leu Gln
275 280 285

Phe Leu Ser Ala Thr Gly His Asp Trp Ala Trp Val Arg Val Ala Gln
290 295 300

Glu Met Leu Ala Thr Thr Ser Pro Thr Gly Ile Ala Met Tyr Ala Leu
305 310 315 320

Leu Ile Ile Leu Phe Thr Phe Phe Tyr Thr Phe Val Gln Ile Asn Pro
325 330 335

Glu Lys Ala Ala Glu Ser Leu Gln Lys Ser Gly Ala Tyr Ile His Gly
340 345 350

Val Arg Pro Gly Lys Gly Thr Glu Glu Tyr Met Ser Lys Leu Leu Arg
355 360 365

Arg Leu Ala Thr Val Gly Ser Leu Phe Leu Gly Val Ile Ser Ile Leu
370 375 380

Pro Ile Ala Ala Lys Asp Val Phe Gly Leu Ser Asp Val Val Ala Phe
385 390 395 400

Gly Gly Thr Ser Leu Leu Ile Ile Ile Ser Thr Gly Ile Glu Gly Ile
405 410 415

Lys Gln Leu Glu Gly Tyr Leu Leu Lys Arg Lys Tyr Val Gly Phe Met
420 425 430

Asp Arg Thr Glu
435

(2) INFORMATION FOR SEQ ID NO: 117:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1959 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956

(D) OTHER INFORMATION: FtsH

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117:

ATG AAA AAA CAA AAT AAT GGT TTA ATT AAA AAT CCT TTT CTA TGG TTA 48
Met Lys Lys Gln Asn Asn Gly Leu Ile Lys Asn Pro Phe Leu Trp Leu
1 5 10 15

TTA TTT ATC TTT TTC CTT GTG ACA GGA TTC CAG TAT TTC TAT TCT GGG 96
Leu Phe Ile Phe Phe Leu Val Thr Gly Phe Gln Tyr Phe Tyr Ser Gly
20 25 30

AAT AAC TCA GGA GGA AGT CAG CAA ATC AAC TAT ACT GAG TTG GTA CAA 144
Asn Asn Ser Gly Gly Ser Gln Gln Ile Asn Tyr Thr Glu Leu Val Gln
35 40 45

GAA ATT ACC GAT GGT AAT GAA AAA GAA TTA ACT TAC CAA CCA AAT GTT 192
Glu Ile Thr Asp Gly Asn Glu Lys Glu Leu Thr Tyr Gln Pro Asn Val
50 55 60

AGT GTT ATC GAA GTT TCT GGT GTC TAT AAA AAT CCT AAA ACA AGT AAA 240
Ser Val Ile Glu Val Ser Gly Val Tyr Lys Asn Pro Lys Thr Ser Lys
65 70 75 80

GAA GGA ACA GGT ATT CAG TTT TTC ACG CCA TCT GTT ACT AAG GTA GAG 288
Glu Gly Thr Gly Ile Gln Phe Phe Thr Pro Ser Val Thr Lys Val Glu
85 90 95

AAA TTT ACC AGC ACT ATT CTT CCT GCA GAT ACT ACC GTA TCA GAA TTG 336
Lys Phe Thr Ser Thr Ile Leu Pro Ala Asp Thr Thr Val Ser Glu Leu
100 105 110

CAA AAA CTT GCT ACT GAC CAT AAA GCA GAA GTA ACT GTT AAG CAT GAA 384
Gln Lys Leu Ala Thr Asp His Lys Ala Glu Val Thr Val Lys His Glu
115 120 125

AGT TCA AGT GGT ATA TGG ATT AAT CTA CTC GTA TCC ATT GTG CCA TTT 432
Ser Ser Ser Gly Ile Trp Ile Asn Leu Leu Val Ser Ile Val Pro Phe
130 135 140

GGA ATT CTA TTC TTC TTC CTA TTC TCT ATG ATG GGA AAT ATG GGA GGA 480
Gly Ile Leu Phe Phe Phe Leu Phe Ser Met Met Gly Asn Met Gly Gly
145 150 155 160

GGC AAT GGC CGT AAT CCA ATG AGT TTT GGA CGT AGT AAG GCT AAA GCA 528
Gly Asn Gly Arg Asn Pro Met Ser Phe Gly Arg Ser Lys Ala Lys Ala
165 170 175

GCA AAT AAA GAA GAT ATT AAA GTA AGA TTT TCA GAT GTT GCT GGA GCT 576
Ala Asn Lys Glu Asp Ile Lys Val Arg Phe Ser Asp Val Ala Gly Ala
180 185 190

GAG GAA GAA AAA CAA GAA CTA GTT GAA GTT GTT GAG TTC TTA AAA GAT 624
Glu Glu Glu Lys Gln Glu Leu Val Glu Val Val Glu Phe Leu Lys Asp
195 200 205

CCA AAA CGA TTC ACA AAA CTT GGA GCC CGT ATT CCA GCA GGT GTT CTT 672
Pro Lys Arg Phe Thr Lys Leu Gly Ala Arg Ile Pro Ala Gly Val Leu
210 215 220

TTG GAG GGA CCT CCG GGG ACA GGT AAG ACT TTG CTT GCT AAG GCA GTC 720
Leu Glu Gly Pro Pro Gly Thr Gly Lys Thr Leu Leu Ala Lys Ala Val
225 230 235 240

GCT GGA GAA GCA GGT GTT CCA TTC TTT AGT ATC TCA GGT TCT GAC TTT 768
Ala Gly Glu Ala Gly Val Pro Phe Phe Ser Ile Ser Gly Ser Asp Phe
245 250 255

GTA GAA ATG TTT GTC GGA GTT GGA GCT AGT CGT GTT CGC TCT CTT TTT 816
Val Glu Met Phe Val Gly Val Gly Ala Ser Arg Val Arg Ser Leu Phe
260 265 270

GAG GAT GCC AAA AAA GCA GCA CCA GCT ATC ATC TTT ATC GAT CTA AAT 864
Glu Asp Ala Lys Lys Ala Ala Pro Ala Ile Ile Phe Ile Asp Leu Asn
275 280 285

GAT GCT GTT GGA CGT CAA CGT GGA GTC GGT CTC GGC GGA GGT AAT GAC 912
Asp Ala Val Gly Arg Gln Arg Gly Val Gly Leu Gly Gly Gly Asn Asp
290 295 300

GAA CGT GAA CAA ACC TTG AAC CAA CTT TTG ATT GAG ATG GAT GGT TTT 960
Glu Arg Glu Gln Thr Leu Asn Gln Leu Leu Ile Glu Met Asp Gly Phe
305 310 315 320 

GAG GGA AAT GAA GGG ATT ATC GTC ATC GCT GCG ACA AAC CGT TCA GAT 1008
Glu Gly Asn Glu Gly Ile Ile Val Ile Ala Ala Thr Asn Arg Ser Asp
325 330 335

GTA CTT GAT CCT GCC CTT TTG CGT CCA GGA CGT TTT GAT AGA AAA GTA 1056
Val Leu Asp Pro Ala Leu Leu Arg Pro Gly Arg Phe Asp Arg Lys Val
340 345 350

TTG GTT GGC CGT CCT GAT GTT AAA GGT CGT GAA GCA ATC TTG AAA GTT 1104
Leu Val Gly Arg Pro Asp Val Lys Gly Arg Glu Ala Ile Leu Lys Val
355 360 365

CAC GCT AAG AAC AAG CCT TTA GCA GAA GAT GTT GAT TTG AAA TTA GTG 1152
His Ala Lys Asn Lys Pro Leu Ala Glu Asp Val Asp Leu Lys Leu Val
370 375 380

GCT CAA CAA ACT CCA GGC TTT GTT GGT GCT GAT TTA GAG AAT GTC TTG 1200
Ala Gln Gln Thr Pro Gly Phe Val Gly Ala Asp Leu Glu Asn Val Leu
385 390 395 400

AAT GAA GCA GCT TTA GTT GCT GCT CGT CGC AAT AAA TCG ATA ATT GAT 1248
Asn Glu Ala Ala Leu Val Ala Ala Arg Arg Asn Lys Ser Ile Ile Asp
405 410 415

GCT TCA GAT ATT GAT GAA GCA GAA GAT AGA GTT ATT GCT GGA CCT TCT 1296
Ala Ser Asp Ile Asp Glu Ala Glu Asp Arg Val Ile Ala Gly Pro Ser
420 425 430

AAG AAA GAT AAG ACA GTT TCA CAA AAA GAA CGA GAA TTG GTT GCT TAC 1344
Lys Lys Asp Lys Thr Val Ser Gln Lys Glu Arg Glu Leu Val Ala Tyr
435 440 445

CAT GAG GCA GGA CAT ACC ATT GTT GGT CTA GTC TTG TCG ACT GCT CGC 1392
His Glu Ala Gly His Thr Ile Val Gly Leu Val Leu Ser Thr Ala Arg
450 455 460

GTT GTC CAT AAG GTT ACA ATT GTA CCA CGC GGC CGT GCA GGC GGA TAC 1440
Val Val His Lys Val Thr Ile Val Pro Arg Gly Arg Ala Gly Gly Tyr
465 470 475 480

ATG ATT GCA CTT CCT AAA GAG GAT CAA ATG CTT CTA TCT AAA GAA GAT 1488
Met Ile Ala Leu Pro Lys Glu Asp Gln Met Leu Leu Ser Lys Glu Asp
485 490 495

ATG AAA GAG CAA TTG GCT GGC TTA ATG GGT GGA CGT GTA GCT GAA GAA 1536
Met Lys Glu Gln Leu Ala Gly Leu Met Gly Gly Arg Val Ala Glu Glu
500 505 510

ATT ATC TTT AAT GTC CAA ACT ACA GGA GCT TCA AAC GAC TTT GAA CAA 1584
Ile Ile Phe Asn Val Gln Thr Thr Gly Ala Ser Asn Asp Phe Glu Gln
515 520 525

GCG ACA CAA ATG GCA CGT GCA ATG GTT ACA GAG TAC GGT ATG AGT GAA 1632
Ala Thr Gln Met Ala Arg Ala Met Val Thr Glu Tyr Gly Met Ser Glu
530 535 540

AAA CTT GGC CCA GTA CAA TAT GAA GGA AAC CAT GCT ATG CTT GGT GCA 1680
Lys Leu Gly Pro Val Gln Tyr Glu Gly Asn His Ala Met Leu Gly Ala
545 550 555 560

CAG AGT CCT CAA AAA TCA ATT TCA GAA CAA ACA GCT TAT GAA ATT GAT 1728
Gln Ser Pro Gln Lys Ser Ile Ser Glu Gln Thr Ala Tyr Glu Ile Asp
565 570 575

GAA GAG GTT CGT TCA TTA TTA AAT GAG GCA CGA AAT AAA GCT GCT GAA 1776
Glu Glu Val Arg Ser Leu Leu Asn Glu Ala Arg Asn Lys Ala Ala Glu
580 585 590

ATT ATT CAG TCA AAT CGT GAA ACT CAC AAG TTA ATT GCA GAA GCA TTA 1824
Ile Ile Gln Ser Asn Arg Glu Thr His Lys Leu Ile Ala Glu Ala Leu
595 600 605

TTG AAA TAC GAA ACA TTG GAT AGT ACA CAA ATT AAA GCT CTT TAC GAA 1872
Leu Lys Tyr Glu Thr Leu Asp Ser Thr Gln Ile Lys Ala Leu Tyr Glu
610 615 620

ACA GGA AAG ATG CCT GAA GCA GTA GAA GAG GAA TCT CAT GCA CTA TCC 1920
Thr Gly Lys Met Pro Glu Ala Val Glu Glu Glu Ser His Ala Leu Ser
625 630 635 640

TAT GAT GAA GTA AAG TCA AAA ATG AAT GAC GAA AAA TAA 1959
Tyr Asp Glu Val Lys Ser Lys Met Asn Asp Glu Lys
645 650

(2) INFORMATION FOR SEQ ID NO: 118:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 652 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118:

Met Lys Lys Gln Asn Asn Gly Leu Ile Lys Asn Pro Phe Leu Trp Leu
1 5 10 15

Leu Phe Ile Phe Phe Leu Val Thr Gly Phe Gln Tyr Phe Tyr Ser Gly
20 25 30

Asn Asn Ser Gly Gly Ser Gln Gln Ile Asn Tyr Thr Glu Leu Val Gln
35 40 45

Glu Ile Thr Asp Gly Asn Glu Lys Glu Leu Thr Tyr Gln Pro Asn Val
50 55 60

Ser Val Ile Glu Val Ser Gly Val Tyr Lys Asn Pro Lys Thr Ser Lys
65 70 75 80

Glu Gly Thr Gly Ile Gln Phe Phe Thr Pro Ser Val Thr Lys Val Glu
85 90 95

Lys Phe Thr Ser Thr Ile Leu Pro Ala Asp Thr Thr Val Ser Glu Leu
100 105 110

Gln Lys Leu Ala Thr Asp His Lys Ala Glu Val Thr Val Lys His Glu
115 120 125

Ser Ser Ser Gly Ile Trp Ile Asn Leu Leu Val Ser Ile Val Pro Phe
130 135 140

Gly Ile Leu Phe Phe Phe Leu Phe Ser Met Met Gly Asn Met Gly Gly
145 150 155 160

Gly Asn Gly Arg Asn Pro Met Ser Phe Gly Arg Ser Lys Ala Lys Ala
165 170 175

Ala Asn Lys Glu Asp Ile Lys Val Arg Phe Ser Asp Val Ala Gly Ala
180 185 190

Glu Glu Glu Lys Gln Glu Leu Val Glu Val Val Glu Phe Leu Lys Asp
195 200 205

Pro Lys Arg Phe Thr Lys Leu Gly Ala Arg Ile Pro Ala Gly Val Leu
210 215 220

Leu Glu Gly Pro Pro Gly Thr Gly Lys Thr Leu Leu Ala Lys Ala Val
225 230 235 240

Ala Gly Glu Ala Gly Val Pro Phe Phe Ser Ile Ser Gly Ser Asp Phe
245 250 255

Val Glu Met Phe Val Gly Val Gly Ala Ser Arg Val Arg Ser Leu Phe
260 265 270

Glu Asp Ala Lys Lys Ala Ala Pro Ala Ile Ile Phe Ile Asp Leu Asn
275 280 285

Asp Ala Val Gly Arg Gln Arg Gly Val Gly Leu Gly Gly Gly Asn Asp
290 295 300

Glu Arg Glu Gln Thr Leu Asn Gln Leu Leu Ile Glu Met Asp Gly Phe
305 310 315 320

Glu Gly Asn Glu Gly Ile Ile Val Ile Ala Ala Thr Asn Arg Ser Asp
325 330 335

Val Leu Asp Pro Ala Leu Leu Arg Pro Gly Arg Phe Asp Arg Lys Val
340 345 350

Leu Val Gly Arg Pro Asp Val Lys Gly Arg Glu Ala Ile Leu Lys Val
355 360 365

His Ala Lys Asn Lys Pro Leu Ala Glu Asp Val Asp Leu Lys Leu Val
370 375 380

Ala Gln Gln Thr Pro Gly Phe Val Gly Ala Asp Leu Glu Asn Val Leu
385 390 395 400

Asn Glu Ala Ala Leu Val Ala Ala Arg Arg Asn Lys Ser Ile Ile Asp
405 410 415

Ala Ser Asp Ile Asp Glu Ala Glu Asp Arg Val Ile Ala Gly Pro Ser
420 425 430

Lys Lys Asp Lys Thr Val Ser Gln Lys Glu Arg Glu Leu Val Ala Tyr
435 440 445

His Glu Ala Gly His Thr Ile Val Gly Leu Val Leu Ser Thr Ala Arg
450 455 460

Val Val His Lys Val Thr Ile Val Pro Arg Gly Arg Ala Gly Gly Tyr
465 470 475 480

Met Ile Ala Leu Pro Lys Glu Asp Gln Met Leu Leu Ser Lys Glu Asp
485 490 495

Met Lys Glu Gln Leu Ala Gly Leu Met Gly Gly Arg Val Ala Glu Glu
500 505 510

Ile Ile Phe Asn Val Gln Thr Thr Gly Ala Ser Asn Asp Phe Glu Gln
515 520 525

Ala Thr Gln Met Ala Arg Ala Met Val Thr Glu Tyr Gly Met Ser Glu
530 535 540

Lys Leu Gly Pro Val Gln Tyr Glu Gly Asn His Ala Met Leu Gly Ala
545 550 555 560

Gln Ser Pro Gln Lys Ser Ile Ser Glu Gln Thr Ala Tyr Glu Ile Asp
565 570 575

Glu Glu Val Arg Ser Leu Leu Asn Glu Ala Arg Asn Lys Ala Ala Glu
580 585 590

Ile Ile Gln Ser Asn Arg Glu Thr His Lys Leu Ile Ala Glu Ala Leu
595 600 605

Leu Lys Tyr Glu Thr Leu Asp Ser Thr Gln Ile Lys Ala Leu Tyr Glu
610 615 620

Thr Gly Lys Met Pro Glu Ala Val Glu Glu Glu Ser His Ala Leu Ser
625 630 635 640

Tyr Asp Glu Val Lys Ser Lys Met Asn Asp Glu Lys
645 650



(2) INFORMATION FOR SEQ ID NO: 119: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1278 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..1278 

(D) OTHER INFORMATION: FtsY

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119:

ATG GGA TTG TTT GAC CGT CTA TTC GGA AAA AAA GAA GAA CCT AAA ATC 48
Met Gly Leu Phe Asp Arg Leu Phe Gly Lys Lys Glu Glu Pro Lys Ile
1 5 10 15

GAA GAA GTT GTA AAA GAA GCT CTG GAA AAT CTT GAT TTG TCT GAA GAT 96
Glu Glu Val Val Lys Glu Ala Leu Glu Asn Leu Asp Leu Ser Glu Asp
20 25 30

GTT GAT CCT ACC TTC ACA GAA GTT GAG GAA GTT TCT CAG GAA GAA GCA 144
Val Asp Pro Thr Phe Thr Glu Val Glu Glu Val Ser Gln Glu Glu Ala
35 40 45

GAG GTT GAA ATT GTT GAA CAA GCT GTG TTC CAA GAA GAG GAA ATC CAA 192
Glu Val Glu Ile Val Glu Gln Ala Val Phe Gln Glu Glu Glu Ile Gln
50 55 60

GAC ACA GTT GAA GAA AGT CTG GAT TTA GAG CCA GTT GTA GAA GTT TCT 240
Asp Thr Val Glu Glu Ser Leu Asp Leu Glu Pro Val Val Glu Val Ser
65 70 75 80

CAA AAA GAA GTC GAA GAA TTT CCA CAC TCA GAA GAA GGG AAT ACT GAG 288
Gln Lys Glu Val Glu Glu Phe Pro His Ser Glu Glu Gly Asn Thr Glu
85 90 95

TTT CTA GAG ACT ATA GAA GAA AAT AAT TCT GAA GTT CTT GAA CCA GAA 336
Phe Leu Glu Thr Ile Glu Glu Asn Asn Ser Glu Val Leu Glu Pro Glu
100 105 110

AGG CCT CAA GCA GAA GAA ACC GTT CAG GAA AAA TAT GAC CGC AGT CTT 384
Arg Pro Gln Ala Glu Glu Thr Val Gln Glu Lys Tyr Asp Arg Ser Leu
115 120 125

AAG AAA ACT CGT ACA GGT TTC GGT GCC CGC TTG AAT GCC TTC TTT GCT 432
Lys Lys Thr Arg Thr Gly Phe Gly Ala Arg Leu Asn Ala Phe Phe Ala
130 135 140

AAC TTC CGC TCT GTT GAC GAA GAA TTT TTC GAG GAA CTG GAA GAA CTG 480
Asn Phe Arg Ser Val Asp Glu Glu Phe Phe Glu Glu Leu Glu Glu Leu
145 150 155 160

CTG ATT ATG AGT GAT GTT GGT GTC CAA GTC GCT TCT AAC TTA ACG GAG 528
Leu Ile Met Ser Asp Val Gly Val Gln Val Ala Ser Asn Leu Thr Glu
165 170 175

GAA CTA CGT TAC GAA GCC AAG CTT GAA AAT GCC AAG AAA CCT GAT GCA 576
Glu Leu Arg Tyr Glu Ala Lys Leu Glu Asn Ala Lys Lys Pro Asp Ala
180 185 190

CTT CGT CGT GTC ATC ATT GAG AAA TTG GTT GAG CTT TAT GAA AAG GAT 624
Leu Arg Arg Val Ile Ile Glu Lys Leu Val Glu Leu Tyr Glu Lys Asp
195 200 205

GGT AGC TAC GAT GAA AGC ATC CAC TTC CAA GAT AAC TTG ACA GTT ATG 672
Gly Ser Tyr Asp Glu Ser Ile His Phe Gln Asp Asn Leu Thr Val Met
210 215 220

CTC TTT GTT GGT GTG AAT GGT GTT GGG AAA ACA ACT TCT ATC GGA AAA 720
Leu Phe Val Gly Val Asn Gly Val Gly Lys Thr Thr Ser Ile Gly Lys
225 230 235 240

CTA GCC CAC CGC TAC AAA CAA GCT GGT AAG AAG GTC ATG CTG GTT GCA 768
Leu Ala His Arg Tyr Lys Gln Ala Gly Lys Lys Val Met Leu Val Ala
245 250 255

GCA GAT ACC TTC CGT GCG GGT GCA GTA GCT CAG CTA GCT GAA TGG GGC 816
Ala Asp Thr Phe Arg Ala Gly Ala Val Ala Gln Leu Ala Glu Trp Gly
260 265 270

CGA CGA GTA GAT GTT CCA GTA GTA ACT GGA CCT GAA AAA GCT GAT CCA 864
Arg Arg Val Asp Val Pro Val Val Thr Gly Pro Glu Lys Ala Asp Pro
275 280 285

GCC AGC GTG GTC TTT GAT GGT ATG GAA CGT GCC GTG GCT GAA GGT ATC 912
Ala Ser Val Val Phe Asp Gly Met Glu Arg Ala Val Ala Glu Gly Ile
290 295 300

GAT ATT CTC ATG ATT GAT ACT GCT GGT CGT CTG CAA AAT AAG GAT AAC 960
Asp Ile Leu Met Ile Asp Thr Ala Gly Arg Leu Gln Asn Lys Asp Asn
305 310 315 320

CTT ATG GCT GAG TTG GAA AAG ATT GGT CGT ATT ATC AAA CGT GTT GTG 1008
Leu Met Ala Glu Leu Glu Lys Ile Gly Arg Ile Ile Lys Arg Val Val
325 330 335

CCA GAA GCA CCA CAT GAA ACC TTC TTG GCA CTT GAT GCA TCA ACA GGT 1056
Pro Glu Ala Pro His Glu Thr Phe Leu Ala Leu Asp Ala Ser Thr Gly
340 345 350

CAA AAT GCC CTA GTA CAG GCC AAA GAA TTT TCG AAA ATC ACA CCT TTA 1104
Gln Asn Ala Leu Val Gln Ala Lys Glu Phe Ser Lys Ile Thr Pro Leu
355 360 365

ACG GGA ATT GTT TTG ACT AAG ATT GAT GGA ACT GCT CGA GGA GGT GTG 1152
Thr Gly Ile Val Leu Thr Lys Ile Asp Gly Thr Ala Arg Gly Gly Val
370 375 380

GTT CTA GCC ATT CGT GAA GAA CTC AAT ATT CCT GTA AAA TTG ATT GGT 1200
Val Leu Ala Ile Arg Glu Glu Leu Asn Ile Pro Val Lys Leu Ile Gly
385 390 395 400

TTT GGT GAA AAA ATC GAT GAT ATT GGA GAG TTT AAC TCA GAA AAC TTT 1248
Phe Gly Glu Lys Ile Asp Asp Ile Gly Glu Phe Asn Ser Glu Asn Phe
405 410 415

ATG AAA GGT CTC TTG GAA GGT TTA ATC TAA 1278
Met Lys Gly Leu Leu Glu Gly Leu Ile *
420 425



(2) INFORMATION FOR SEQ ID NO: 120: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 425 amino acids 
(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120:

Met Gly Leu Phe Asp Arg Leu Phe Gly Lys Lys Glu Glu Pro Lys Ile
1 5 10 15

Glu Glu Val Val Lys Glu Ala Leu Glu Asn Leu Asp Leu Ser Glu Asp
20 25 30

Val Asp Pro Thr Phe Thr Glu Val Glu Glu Val Ser Gln Glu Glu Ala
35 40 45

Glu Val Glu Ile Val Glu Gln Ala Val Phe Gln Glu Glu Glu Ile Gln
50 55 60

Asp Thr Val Glu Glu Ser Leu Asp Leu Glu Pro Val Val Glu Val Ser
65 70 75 80

Gln Lys Glu Val Glu Glu Phe Pro His Ser Glu Glu Gly Asn Thr Glu
85 90 95

Phe Leu Glu Thr Ile Glu Glu Asn Asn Ser Glu Val Leu Glu Pro Glu
100 105 110

Arg Pro Gln Ala Glu Glu Thr Val Gln Glu Lys Tyr Asp Arg Ser Leu
115 120 125

Lys Lys Thr Arg Thr Gly Phe Gly Ala Arg Leu Asn Ala Phe Phe Ala
130 135 140

Asn Phe Arg Ser Val Asp Glu Glu Phe Phe Glu Glu Leu Glu Glu Leu
145 150 155 160

Leu Ile Met Ser Asp Val Gly Val Gln Val Ala Ser Asn Leu Thr Glu
165 170 175

Glu Leu Arg Tyr Glu Ala Lys Leu Glu Asn Ala Lys Lys Pro Asp Ala
180 185 190

Leu Arg Arg Val Ile Ile Glu Lys Leu Val Glu Leu Tyr Glu Lys Asp
195 200 205

Gly Ser Tyr Asp Glu Ser Ile His Phe Gln Asp Asn Leu Thr Val Met
210 215 220

Leu Phe Val Gly Val Asn Gly Val Gly Lys Thr Thr Ser Ile Gly Lys
225 230 235 240

Leu Ala His Arg Tyr Lys Gln Ala Gly Lys Lys Val Met Leu Val Ala
245 250 255

Ala Asp Thr Phe Arg Ala Gly Ala Val Ala Gln Leu Ala Glu Trp Gly
260 265 270

Arg Arg Val Asp Val Pro Val Val Thr Gly Pro Glu Lys Ala Asp Pro
275 280 285

Ala Ser Val Val Phe Asp Gly Met Glu Arg Ala Val Ala Glu Gly Ile
290 295 300

Asp Ile Leu Met Ile Asp Thr Ala Gly Arg Leu Gln Asn Lys Asp Asn
305 310 315 320

Leu Met Ala Glu Leu Glu Lys Ile Gly Arg Ile Ile Lys Arg Val Val
325 330 335

Pro Glu Ala Pro His Glu Thr Phe Leu Ala Leu Asp Ala Ser Thr Gly
340 345 350

Gln Asn Ala Leu Val Gln Ala Lys Glu Phe Ser Lys Ile Thr Pro Leu
355 360 365

Thr Gly Ile Val Leu Thr Lys Ile Asp Gly Thr Ala Arg Gly Gly Val
370 375 380

Val Leu Ala Ile Arg Glu Glu Leu Asn Ile Pro Val Lys Leu Ile Gly
385 390 395 400

Phe Gly Glu Lys Ile Asp Asp Ile Gly Glu Phe Asn Ser Glu Asn Phe
405 410 415

Met Lys Gly Leu Leu Glu Gly Leu Ile
420 425

(2) INFORMATION FOR SEQ ID NO: 121: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 891 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..891 

(D) OTHER INFORMATION: HI1146 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121: 

ATG ACA AAG AAA CAA CTT CAC TTG GTG ATT GTG ACA GGG ATG GGT GGC 48 
Met Thr Lys Lys Gln Leu His Leu Val Ile Val Thr Gly Met Gly Gly 
1 5 10 15 

GCA GGG AAA ACT GTA GCC ATT CAG TCC TTC GAG GAT CTA GGT TAT TTC 96 
Ala Gly Lys Thr Val Ala Ile Gln Ser Phe Glu Asp Leu Gly Tyr Phe 
20 25 30 

ACC ATT GAT AAT ATG CCG CCA GCT CTC TTG CCT AAG TTT TTG CAG CTG 144 
Thr Ile Asp Asn Met Pro Pro Ala Leu Leu Pro Lys Phe Leu Gln Leu 
35 40 45 

GTT GAA ATT AAG GAA GAC AAT CCT AAG TTG GCC TTG GTA GTG GAT ATG 192 
Val Glu Ile Lys Glu Asp Asn Pro Lys Leu Ala Leu Val Val Asp Met 
50 55 60 

CGT AGT CGT TCT TTC TTT TCA GAG ATT CAA GCT GTT TTG GAT GAG TTG 240 
Arg Ser Arg Ser Phe Phe Ser Glu Ile Gln Ala Val Leu Asp Glu Leu 
65 70 75 80 

GAA AAT CAA GAT GGT TTG GAT TTC AAA ATC CTC TTT TTG GAT GCG GCT 288 
Glu Asn Gln Asp Gly Leu Asp Phe Lys Ile Leu Phe Leu Asp Ala Ala 
85 90 95 

GAT AAG GAA TTG GTC GCT CGT TAC AAG GAA ACC AGA CGG AGT CAC CCA 336 
Asp Lys Glu Leu Val Ala Arg Tyr Lys Glu Thr Arg Arg Ser His Pro 
100 105 110 

CTA GCA GCA GAC GGT CGT ATT TTA GAT GGA ATC AAG TTG GAA CGT GAA 384 
Leu Ala Ala Asp Gly Arg Ile Leu Asp Gly Ile Lys Leu Glu Arg Glu 
115 120 125 

CTC TTG GCA CCT TTG AAA AAT ATG AGC CAA AAT GTG GTG GAT ACG ACT 432 
Leu Leu Ala Pro Leu Lys Asn Met Ser Gln Asn Val Val Asp Thr Thr 
130 135 140 

GAA CTC ACT CCA CGT GAG CTG CGC AAA ACC CTT GCA GAG CAG TTT TCA 480 
Glu Leu Thr Pro Arg Glu Leu Arg Lys Thr Leu Ala Glu Gln Phe Ser 
145 150 155 160 

GAC CAA GAA CAA GCT CAG TCT TTC CGT ATC GAA GTC ATG TCT TTC GGA 528 
Asp Gln Glu Gln Ala Gln Ser Phe Arg Ile Glu Val Met Ser Phe Gly 
165 170 175 

TTT AAG TAT GGA ATC CCG ATT GAT GCG GAC TTG GTC TTT GAT GTC CGT 576 
Phe Lys Tyr Gly Ile Pro Ile Asp Ala Asp Leu Val Phe Asp Val Arg 
180 185 190 

TTC TTG CCA AAT CCC TAT TAT TTA CCA GAA CTG AGA AAC CAA ACG GGT 624 
Phe Leu Pro Asn Pro Tyr Tyr Leu Pro Glu Leu Arg Asn Gln Thr Gly 
195 200 205 

GTG GAT GAA CCT GTT TAT GAT TAT GTC ATG AAC CAT CCT GAG TCA GAA 672 
Val Asp Glu Pro Val Tyr Asp Tyr Val Met Asn His Pro Glu Ser Glu 
210 215 220 

GAC TTT TAT CAA CAT TTA TTG GCC TTG ATT GAG CCG ATT CTG CCA AGT 720 
Asp Phe Tyr Gln His Leu Leu Ala Leu Ile Glu Pro Ile Leu Pro Ser 
225 230 235 240 

TAC CAA AAG GAA GGT AAG TCC GTT TTG ACC ATT GCC ATG GGA TGT ACG 768 
Tyr Gln Lys Glu Gly Lys Ser Val Leu Thr Ile Ala Met Gly Cys Thr 
245 250 255 

GGT GGA CAA CAC CGT AGT GTG GCA TTT GCT AAA CGC TTG GTG CAG GAC 816 
Gly Gly Gln His Arg Ser Val Ala Phe Ala Lys Arg Leu Val Gln Asp 
260 265 270 

TTA TCC AAG AAT TGG TCT GTT AAT GAA GGG CAT CGC GAC AAA GAC CGC 864 
Leu Ser Lys Asn Trp Ser Val Asn Glu Gly His Arg Asp Lys Asp Arg 
275 280 285 

AGA AAG GAA ACG GTA AAC CGT TCA TGA 891 
Arg Lys Glu Thr Val Asn Arg Ser * 
290 295 

(2) INFORMATION FOR SEQ ID NO: 122: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 296 amino acids 

(B) TYPE: amino acid 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122: 

Met Thr Lys Lys Gln Leu His Leu Val Ile Val Thr Gly Met Gly Gly 
1 5 10 15 

Ala Gly Lys Thr Val Ala Ile Gln Ser Phe Glu Asp Leu Gly Tyr Phe 
20 25 30 

Thr Ile Asp Asn Met Pro Pro Ala Leu Leu Pro Lys Phe Leu Gln Leu 
35 40 45 

Val Glu Ile Lys Glu Asp Asn Pro Lys Leu Ala Leu Val Val Asp Met 
50 55 60 

Arg Ser Arg Ser Phe Phe Ser Glu Ile Gln Ala Val Leu Asp Glu Leu 
65 70 75 80 

Glu Asn Gln Asp Gly Leu Asp Phe Lys Ile Leu Phe Leu Asp Ala Ala 
85 90 95 

Asp Lys Glu Leu Val Ala Arg Tyr Lys Glu Thr Arg Arg Ser His Pro 
100 105 110 

Leu Ala Ala Asp Gly Arg Ile Leu Asp Gly Ile Lys Leu Glu Arg Glu 
115 120 125 

Leu Leu Ala Pro Leu Lys Asn Met Ser Gln Asn Val Val Asp Thr Thr 
130 135 140 

Glu Leu Thr Pro Arg Glu Leu Arg Lys Thr Leu Ala Glu Gln Phe Ser 
145 150 155 160 

Asp Gln Glu Gln Ala Gln Ser Phe Arg Ile Glu Val Met Ser Phe Gly 
165 170 175 

Phe Lys Tyr Gly Ile Pro Ile Asp Ala Asp Leu Val Phe Asp Val Arg 
180 185 190 

Phe Leu Pro Asn Pro Tyr Tyr Leu Pro Glu Leu Arg Asn Gln Thr Gly 
195 200 205 

Val Asp Glu Pro Val Tyr Asp Tyr Val Met Asn His Pro Glu Ser Glu 
210 215 220 

Asp Phe Tyr Gln His Leu Leu Ala Leu Ile Glu Pro Ile Leu Pro Ser 
225 230 235 240 

Tyr Gln Lys Glu Gly Lys Ser Val Leu Thr Ile Ala Met Gly Cys Thr 
245 250 255 

Gly Gly Gln His Arg Ser Val Ala Phe Ala Lys Arg Leu Val Gln Asp 
260 265 270 

Leu Ser Lys Asn Trp Ser Val Asn Glu Gly His Arg Asp Lys Asp Arg 
275 280 285 

Arg Lys Glu Thr Val Asn Arg Ser 
290 295 



(2) INFORMATION FOR SEQ ID NO: 123: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 329 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123: 

Met Val Glu Val Pro Asp Glu Arg Leu Gln Lys Leu Thr Glu Met Ile 
1 5 10 15 

Thr Pro Lys Lys Thr Val Pro Thr Thr Phe Glu Phe Thr Asp Ile Ala 
20 25 30 

Gly Ile Val Lys Gly Ala Ser Lys Gly Glu Gly Leu Gly Asn Lys Phe 
35 40 45 

Leu Ala Asn Ile Arg Glu Val Asp Ala Ile Val His Val Val Arg Ala 
50 55 60 

Phe Asp Asp Glu Asn Val Met Arg Glu Gln Gly Arg Glu Asp Ala Phe 
65 70 75 80 

Val Asp Pro Leu Ala Asp Ile Asp Thr Ile Asn Leu Glu Leu Ile Leu 
85 90 95 

Ala Asp Leu Glu Ser Val Asn Lys Arg Tyr Ala Arg Val Glu Lys Met 
100 105 110 

Ala Arg Thr Gln Lys Asp Lys Glu Ser Val Ala Glu Phe Asn Val Leu 
115 120 125 

Gln Lys Ile Lys Pro Val Leu Glu Asp Gly Lys Ser Ala Arg Thr Ile 
130 135 140 

Glu Phe Thr Asp Glu Glu Gln Lys Val Val Lys Gly Leu Phe Leu Leu 
145 150 155 160 

Thr Thr Lys Pro Val Leu Tyr Val Ala Asn Val Asp Glu Asp Val Val 
165 170 175 

Ser Glu Pro Asp Ser Ile Asp Tyr Val Lys Gln Ile Arg Glu Phe Ala 
180 185 190 

Ala Thr Glu Asn Ala Glu Val Val Val Ile Ser Ala Arg Ala Glu Glu 
195 200 205 

Glu Ile Ser Glu Leu Asp Asp Glu Asp Lys Lys Glu Phe Leu Glu Ala 
210 215 220 

Ile Gly Leu Thr Glu Ser Gly Val Asp Lys Leu Thr Arg Ala Ala Tyr 
225 230 235 240 

His Leu Leu Gly Leu Gly Thr Tyr Phe Thr Ala Gly Glu Lys Glu Val 
245 250 255 

Arg Ala Trp Thr Phe Lys Arg Gly Met Lys Ala Pro Gln Ala Ala Gly 
260 265 270 

Ile Ile His Ser Asp Phe Glu Lys Gly Phe Ile Arg Ala Val Thr Met 
275 280 285 

Ser Tyr Glu Asp Leu Val Lys Tyr Gly Ser Glu Lys Ala Val Lys Glu 
290 295 300 

Ala Gly Arg Leu Arg Glu Glu Gly Lys Glu Tyr Ile Val Gln Asp Gly 
305 310 315 320 

Asp Ile Met Glu Phe Arg Phe Asn Val 
325 

(2) INFORMATION FOR SEQ ID NO: 124: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 189 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124: 



Met Ser Ala Ser Glu Gly Arg Asp Pro Tyr Glu Asp Tyr Leu Ala Ile 
1 5 10 15 

Asn Lys Glu Leu Glu Ser Tyr Asn Leu Arg Leu Met Glu Arg Pro Gln 
20 25 30 

Ile Ile Val Thr Asn Lys Met Asp Met Pro Glu Ser Gln Glu Asn Leu 
35 40 45 

Glu Glu Phe Lys Lys Lys Leu Ala Glu Asn Tyr Asp Glu Phe Glu Glu 
50 55 60 

Leu Pro Ala Ile Phe Pro Ile Ser Gly Leu Thr Lys Gln Gly Leu Ala 
65 70 75 80 

Thr Leu Leu Asp Ala Thr Ala Glu Leu Leu Asp Lys Thr Pro Glu Phe 
85 90 95 

Leu Leu Tyr Asp Glu Ser Asp Met Glu Glu Glu Val Tyr Tyr Gly Phe 
100 105 110 

Asp Glu Glu Glu Lys Ala Phe Glu Ile Ser Arg Asp Asp Asp Ala Thr 
115 120 125 

Trp Val Leu Ser Gly Glu Lys Leu Met Lys Leu Phe Asn Met Thr Asn 
130 135 140 

Phe Asp Arg Asp Glu Ser Val Met Lys Phe Ala Arg Gln Leu Arg Gly 
145 150 155 160 

Met Gly Val Asp Glu Ala Leu Arg Ala Arg Gly Ala Lys Asp Gly Asp 
165 170 175 

Leu Val Arg Ile Gly Lys Phe Glu Phe Glu Phe Val Asp 
180 185 

(2) INFORMATION FOR SEQ ID NO: 125: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 226 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:125: 

Met Asn Ile Gln Gln Leu Arg Tyr Val Val Ala Ile Ala Asn Ser Gly 
1 5 10 15 

Thr Phe Arg Glu Ala Ala Glu Lys Met Tyr Val Ser Gln Pro Ser Leu 
20 25 30 

Ser Ile Ser Val Arg Asp Leu Glu Lys Glu Leu Gly Phe Lys Ile Phe 
35 40 45 

Arg Arg Thr Ser Ser Gly Thr Phe Leu Thr Arg Arg Gly Met Glu Phe 
50 55 60 

Tyr Glu Lys Ala Gln Glu Leu Val Lys Gly Phe Asp Ile Phe Gln Asn 
65 70 75 80 

Gln Tyr Ala Asn Pro Glu Glu Glu Lys Asp Glu Phe Ser Val Ala Ser 
85 90 95 

Gln His Tyr Asp Phe Leu Pro Pro Thr Ile Thr Ala Phe Ser Glu Arg 
100 105 110 

Tyr Pro Asp Tyr Lys Asn Phe Arg Ile Phe Glu Ser Thr Thr Val Gln 
115 120 125 

Ile Leu Asp Glu Val Ala Gln Gly His Ser Glu Ile Gly Ile Ile Tyr 
130 135 140 

Leu Asn Asn Gln Asn Lys Lys Gly Ile Met Gln Arg Val Glu Lys Leu 
145 150 155 160 

Gly Leu Glu Val Ile Glu Leu Ile Pro Phe His Thr His Ile Tyr Leu 
165 170 175 

Cys Glu Gly His Pro Leu Ala Gln Lys Glu Glu Leu Val Met Glu Asp 
180 185 190 

Leu Ala Asp Leu Pro Thr Val Arg Phe Thr Gln Glu Lys Asp Glu Tyr 
195 200 205 

Leu Tyr Tyr Ser Glu Asn Phe Val Asp Thr Ser Ala Thr His Arg Cys 
210 215 220 

Leu Met 
225 

(2) INFORMATION FOR SEQ ID NO: 126: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 119 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126: 

Met Lys Lys Arg Ala Ile Val Ala Val Ile Val Leu Leu Leu Ile Gly 
1 5 10 15 

Leu Asp Gln Leu Val Lys Ser Tyr Ile Val Gln Gln Ile Pro Leu Gly 
20 25 30 

Glu Val Arg Ser Trp Ile Pro Asn Phe Val Ser Leu Thr Tyr Leu Gln 
35 40 45 

Asn Arg Gly Ala Ala Phe Ser Ile Leu Gln Asp Gln Gln Leu Leu Phe 
50 55 60 

Ala Val Ile Thr Leu Val Val Val Ile Gly Ala Ile Trp Tyr Leu His 
65 70 75 80 

Lys His Met Glu Asp Ser Phe Trp Met Val Leu Gly Leu Thr Leu Ile 
85 90 95 

Ile Ala Gly Gly Pro Gly Asn Phe Ile Asp Arg Val Ser Gln Gly Phe 
100 105 110 

Val Val Asp Met Phe His Leu 
115 

(2) INFORMATION FOR SEQ ID NO: 127: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 170 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127: 

Met Ser Lys Tyr Leu Leu Lys Leu Leu Val Tyr Cys Phe Ser Ala Leu 
1 5 10 15 

Thr Phe Gly Ser Leu Phe Leu Ile Ile Gly Phe Ile Leu Ile Lys Gly 
20 25 30 

Leu Pro His Leu Ser Leu Ser Leu Phe Ser Trp Thr Tyr Thr Ser Glu 
35 40 45 

Asn Ile Ser Leu Met Pro Ala Ile Ile Ser Thr Val Ile Leu Val Phe 
50 55 60 

Gly Ala Leu Leu Leu Ala Leu Pro Ile Gly Ile Phe Ala Gly Phe Tyr 
65 70 75 80 

Leu Val Glu Tyr Thr Lys Lys Asp Ser Leu Cys Val Lys Ile Met Arg 
85 90 95 

Leu Ala Ser Asp Thr Leu Ser Gly Ile Pro Ser Ile Val Phe Gly Leu 
100 105 110 

Phe Gly Met Leu Phe Phe Val Val Phe Leu Gly Phe Gln Tyr Ser Leu 
115 120 125 

Leu Ser Gly Ile Leu Thr Ser Val Ile Met Val Leu Pro Val Ile Ile 
130 135 140 

Arg Ser Thr Glu Glu Ala Leu Leu Ser Val Ser Asp Ser Met Arg Gln 
145 150 155 160 

Ala Ser Tyr Gly Leu Gly Ala Leu Ser Tyr 
165 170 

(2) INFORMATION FOR SEQ ID NO: 128: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 154 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128: 

Met Lys Thr Glu Gln Thr Ala Ser Lys Thr Ser Ala Leu Lys Gly Lys 
1 5 10 15 

Glu Val Ala Asp Phe Glu Leu Met Gly Val Asp Gly Lys Thr Tyr Arg 
20 25 30 

Leu Ser Asp Tyr Lys Gly Lys Lys Val Tyr Leu Lys Phe Trp Ala Ser 
35 40 45 

Trp Cys Ser Ile Cys Leu Ala Ser Leu Pro Asp Thr Asp Glu Ile Ala 
50 55 60 

Lys Glu Ala Gly Asp Asp Tyr Val Val Leu Thr Val Val Ser Pro Gly 
65 70 75 80 

His Lys Gly Glu Gln Ser Glu Ala Asp Phe Lys Asn Trp Tyr Lys Gly 
85 90 95 

Leu Asp Tyr Lys Asn Leu Pro Val Leu Val Asp Pro Ser Gly Lys Leu 
100 105 110 

Leu Glu Thr Tyr Gly Val Arg Ser Tyr Pro Thr Gln Ala Phe Ile Asp 
115 120 125 

Lys Glu Gly Lys Leu Val Lys Thr His Pro Gly Phe Met Glu Lys Asp 
130 135 140 

Ala Ile Leu Gln Thr Leu Lys Glu Leu Ser 
145 150 

(2) INFORMATION FOR SEQ ID NO: 129: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 181 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:129: 

Met Lys Lys Glu Gln Ile Pro Asn Leu Leu Thr Ile Gly Arg Ile Leu 
1 5 10 15 

Phe Ile Pro Ile Phe Ile Phe Ile Leu Thr Ile Gly Asn Ser Ile Glu 
20 25 30 

Ser His Ile Val Ala Ala Ile Ile Phe Ala Val Ala Ser Ile Thr Asp 
35 40 45 

Tyr Leu Asp Gly Tyr Leu Ala Arg Lys Trp Asn Val Val Ser Asn Phe 
50 55 60 

Gly Lys Phe Ala Asp Pro Met Ala Asp Lys Leu Leu Val Met Ser Ala 
65 70 75 80 

Phe Ile Met Leu Ile Glu Leu Gly Met Ala Pro Ala Trp Ile Val Ala 
85 90 95 

Val Ile Ile Cys Arg Glu Leu Ala Val Thr Gly Leu Arg Leu Leu Leu 
100 105 110 

Val Glu Thr Gly Gly Thr Ile Leu Ala Ala Ala Met Pro Gly Lys Ile 
115 120 125 

Lys Thr Phe Ser Gln Met Phe Ala Ile Ile Phe Leu Leu Leu His Trp 
130 135 140 

Thr Leu Leu Gly Gln Val Leu Leu Tyr Val Ala Leu Phe Phe Thr Ile 
145 150 155 160 

Tyr Ser Gly Tyr Asp Tyr Phe Lys Gly Ser Ala Tyr Val Phe Lys Gly 
165 170 175 

Thr Phe Gly Ser Lys 
180 

(2) INFORMATION FOR SEQ ID NO: 130: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 111 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130: 

Leu Arg Leu Lys Glu Met Asn Gly Asp Met Ile His Ala Ala Tyr Asp 
1 5 10 15 

Leu Gly Ala Ser Gln Phe Gln Met Phe Lys Glu Ile Met Leu Pro Tyr 
20 25 30 

Leu Thr Pro Ser Ile Ile Ala Gly Tyr Phe Met Ala Phe Thr Tyr Ser 
35 40 45 

Leu Asp Asp Phe Ala Val Thr Phe Phe Val Thr Gly Asn Gly Phe Ser 
50 55 60 

Thr Leu Ser Val Glu Ile Tyr Ser Arg Ala Arg Lys Gly Ile Ser Leu 
65 70 75 80 

Glu Ile Asn Ala Leu Ser Ala Leu Val Phe Leu Phe Ser Ile Ile Leu 
85 90 95 

Val Val Gly Tyr Tyr Phe Ile Ser Arg Glu Lys Glu Glu Gln Ala 
100 105 110 

(2) INFORMATION FOR SEQ ID NO: 131: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 59 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131: 

Pro Gln Phe Thr Glu Glu Thr Gly Ile Gln Val Gln Tyr Glu Ala Phe 
1 5 10 15 

Asp Ser Asn Glu Ala Met Tyr Thr Lys Ile Lys Gln Gly Gly Thr Thr 
20 25 30 

Tyr Asp Ile Ala Ile Pro Ser Glu Tyr Met Ile Asn Lys Met Lys Asp 
35 40 45 

Glu Asp Leu Leu Val Pro Leu Asp Tyr Ser Lys 
50 55 

(2) INFORMATION FOR SEQ ID NO: 132: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 232 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132: 

Met Gln Thr Gln Glu Lys His Ser Gln Ala Ala Val Leu Gly Leu Gln 
1 5 10 15 

His Leu Leu Ala Met Tyr Ser Gly Ser Ile Leu Val Pro Ile Met Ile 
20 25 30 

Ala Thr Ala Leu Gly Tyr Ser Ala Glu Gln Leu Thr Tyr Leu Ile Ser 
35 40 45 

Thr Asp Ile Phe Met Cys Gly Val Ala Thr Phe Leu Gln Leu Gln Leu 
50 55 60 

Asn Lys Tyr Phe Gly Ile Gly Leu Pro Val Val Leu Gly Val Ala Phe 
65 70 75 80 

Gln Ser Val Ala Pro Leu Ile Met Ile Gly Gln Ser His Gly Ser Gly 
85 90 95 

Ala Met Phe Gly Ala Leu Ile Ala Ser Gly Ile Tyr Val Val Leu Val 
100 105 110 

Ser Gly Ile Phe Ser Lys Val Ala Asn Leu Phe Pro Ser Ile Val Thr 
115 120 125 

Gly Ser Val Ile Thr Thr Ile Gly Leu Thr Leu Ile Pro Val Ala Ile 
130 135 140 

Gly Asn Met Gly Asn Asn Val Pro Glu Pro Thr Gly Gln Ser Leu Leu 
145 150 155 160 

Leu Ala Ala Ile Thr Val Leu Ile Ile Leu Leu Ile Asn Ile Phe Thr 
165 170 175 

Lys Gly Phe Ile Lys Ser Ile Ser Ile Leu Ile Gly Leu Val Val Gly 
180 185 190 

Thr Ala Ile Ala Ala Thr Met Gly Leu Val Asp Phe Ser Pro Val Ala 
195 200 205 

Val Val His Leu Ser Met Ser Gln Leu His Ser Thr Leu Gly Cys Gln 
210 215 220 

Pro Leu Lys Ser His Leu Leu Ser 
225 230 

(2) INFORMATION FOR SEQ ID NO: 133: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 343 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133: 

Lys Val Pro Val Tyr Leu Gly Ser Ser Phe Ala Phe Ile Thr Ala Met 
1 5 10 15 

Ser Leu Ala Met Lys Glu Met Gly Gly Asp Val Ser Ala Ala Gln Thr 
20 25 30 

Gly Val Ile Leu Thr Gly Leu Val Tyr Val Leu Val Ala Thr Ser Ile 
35 40 45 

Arg Phe Val Gly Thr Lys Trp Ile Asp Lys Leu Leu Pro Pro Ile Ile 
50 55 60 

Ile Gly Pro Met Ile Ile Val Ile Gly Leu Gly Leu Ala Gly Ser Ala 
65 70 75 80 

Val Thr Asn Ala Gly Leu Val Ala Asp Gly Asn Trp Lys Asn Ala Leu 
85 90 95 

Val Ala Val Val Thr Phe Leu Ile Ala Ala Phe Ile Asn Thr Lys Gly 
100 105 110 

Lys Gly Phe Leu Arg Ile Ile Pro Phe Leu Phe Ala Ile Ile Gly Gly 
115 120 125 

Tyr Leu Phe Ala Leu Thr Leu Gly Leu Val Asp Phe Thr Pro Val Leu 
130 135 140 

Lys Ala Asn Trp Phe Glu Ile Pro Gly Phe Tyr Leu Pro Phe Ser Thr 
145 150 155 160 

Gly Gly Ala Phe Lys Glu Tyr Asn Leu Tyr Phe Gly Pro Glu Ala Ile 
165 170 175 

Ala Ile Leu Pro Ile Ala Ile Val Thr Ile Ser Glu His Ile Gly Asp 
180 185 190 

His Thr Val Leu Gly Gln Ile Cys Gly Arg Gln Phe Leu Lys Glu Pro 
195 200 205 

Gly Leu His Arg Thr Leu Leu Gly Asp Gly Ile Ala Thr Ser Val Ser 
210 215 220 

Ala Phe Leu Gly Gly Pro Ala Asn Thr Thr Tyr Gly Glu Asn Thr Gly 
225 230 235 240 

Val Ile Gly Met Thr Arg Ile Ala Ser Val Ser Val Ile Arg Asn Ala 
245 250 255 

Ala Phe Ile Ala Ile Ala Leu Ser Phe Leu Gly Lys Phe Thr Ala Leu 
260 265 270 

Ile Ser Thr Ile Pro Asn Ala Val Leu Gly Gly Met Ser Ile Leu Leu 
275 280 285 

Tyr Gly Val Ile Ala Ser Asn Gly Leu Lys Val Leu Ile Lys Glu Arg 
290 295 300 

Val Asp Phe Ala Gln Met Arg Asn Leu Ile Ile Ala Ser Ala Met Leu 
305 310 315 320 

Val Leu Gly Leu Gly Gly Ala Ile Leu Lys Leu Gly Pro Val His Phe 
325 330 335 

Gln Val Leu Pro Phe Gln Pro 
340 



(2) INFORMATION FOR SEQ ID NO: 134: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 184 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134: 

Ser Leu Ile Ile Ala Leu Ala Thr Thr Leu Ile Ala Ile Ile Ile Ser 
1 5 10 15 

Ala Met Ala Ala Tyr Gly Ile Val Arg Phe Phe Pro Lys Leu Gly Ala 
20 25 30 

Ile Met Ser Arg Leu Leu Val Ile Thr Tyr Ile Phe Pro Pro Ile Leu 
35 40 45 

Leu Ala Ile Pro Tyr Ser Ile Ala Ile Ala Lys Val Gly Leu Thr Asn 
50 55 60 

Ser Leu Phe Gly Leu Met Met Val Tyr Leu Ser Phe Ser Val Pro Tyr 
65 70 75 80 

Ala Val Trp Leu Leu Val Gly Phe Phe Gln Thr Val Pro Ile Gly Ile 
85 90 95 

Glu Glu Ala Ala Arg Ile Asp Gly Ala Asn Lys Phe Val Thr Phe Tyr 
100 105 110 

Lys Val Val Leu Pro Ile Val Ala Pro Gly Ile Val Ala Thr Ala Ile 
115 120 125 

Tyr Thr Phe Ile Asn Ala Trp Asn Glu Phe Leu Tyr Ala Leu Ile Leu 
130 135 140 

Ile Asn Asn Thr Gly Lys Met Thr Val Ala Val Ala Leu Arg Ser Leu 
145 150 155 160 

Asn Gly Ser Glu Ile Leu Asp Trp Gly Asp Met Met Ala Ala Ser Val 
165 170 175 

Ile Val Val Leu Pro Ser Ile Ile 
180 

(2) INFORMATION FOR SEQ ID NO: 135: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 155 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135: 

Asp Glu Leu Ala Asp Leu Met Met Val Ala Ser Lys Glu Val Glu Asp 
1 5 10 15 

Ala Ile Ile Arg Leu Gly Gln Lys Ala Arg Ala Ala Gly Ile His Met 
20 25 30 

Ile Leu Ala Thr Gln Arg Pro Ser Val Asp Val Ile Ser Gly Leu Ile 
35 40 45 

Lys Ala Asn Val Pro Ser Arg Val Ala Phe Ala Val Ser Ser Gly Thr 
50 55 60 

Asp Ser Arg Thr Ile Leu Asp Glu Asn Gly Ala Glu Lys Leu Leu Gly 
65 70 75 80 

Arg Gly Asp Met Leu Phe Lys Pro Ile Asp Glu Asn His Pro Val Arg 
85 90 95 

Leu Gln Gly Ser Phe Ile Ser Asp Asp Asp Val Glu Arg Ile Val Asn 
100 105 110 

Phe Ile Lys Thr Gln Ala Asp Ala Asp Tyr Asp Glu Ser Phe Asp Pro 
115 120 125 

Gly Glu Val Ser Glu Asn Glu Gly Glu Phe Ser Asp Gly Asp Ala Gly 
130 135 140 

Gly Asp Pro Leu Phe Glu Glu Ala Lys Ser Leu 
145 150 155 

(2) INFORMATION FOR SEQ ID NO: 136: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 154 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 136: 

Met Thr Glu Asn Thr Pro Lys Ala Leu Val Gln Val Asn Gln Lys Pro 
1 5 10 15 

Leu Ile Glu Tyr Gln Ile Glu Phe Leu Lys Glu Lys Gly Ile Asn Asp 
20 25 30 

Ile Ile Ile Ile Val Gly Tyr Leu Lys Glu Gln Phe Asp Tyr Leu Lys 
35 40 45 

Glu Lys Tyr Gly Val Arg Leu Val Phe Asn Asp Lys Tyr Ala Asp Tyr 
50 55 60 

Asn Asn Phe Tyr Ser Leu Tyr Leu Val Lys Glu Glu Leu Ala Asn Ser 
65 70 75 80 

Tyr Val Ile Asp Ala Asp Asn Tyr Leu Phe Lys Asn Met Phe Arg Asn 
85 90 95 

Asp Leu Thr Arg Ser Thr Tyr Phe Ser Val Tyr Arg Glu Asp Cys Thr 
100 105 110 

Asn Glu Trp Phe Leu Val Tyr Gly Asp Asp Tyr Lys Val Gln Asp Ile 
115 120 125 

Ile Val Asp Ser Lys Ala Gly Arg Ile Leu Ser Gly Val Ser Phe Trp 
130 135 140 

Asp Ala Pro Thr Ala Glu Lys Ile Val Ser 
145 150 

(2) INFORMATION FOR SEQ ID NO: 137: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 286 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137: 

Met Ser Asp Asn Ser Lys Thr Arg Val Val Val Gly Met Ser Gly Gly 
1 5 10 15 

Val Asp Ser Ser Val Thr Ala Leu Leu Leu Lys Glu Gln Gly Tyr Asp 
20 25 30 

Val Ile Gly Ile Phe Met Lys Asn Trp Asp Asp Thr Asp Glu Asn Gly 
35 40 45 

Val Cys Thr Ala Thr Glu Asp Tyr Lys Asp Val Val Ala Val Ala Asp 
50 55 60 

Gln Ile Gly Ile Pro Tyr Tyr Ser Val Asn Phe Glu Lys Glu Tyr Trp 
65 70 75 80 

Asp Arg Val Phe Glu Tyr Phe Leu Ala Glu Tyr Arg Ala Gly Arg Thr 
85 90 95 

Pro Asn Pro Asp Val Met Cys Asn Lys Glu Ile Lys Phe Lys Ala Phe 
100 105 110 

Leu Asp Tyr Ala Met Thr Leu Gly Ala Asp Tyr Val Ala Thr Gly His 
115 120 125 

Tyr Ala Arg Val Ala Arg Asp Glu Asp Gly Thr Val His Met Leu Arg 
130 135 140 

Gly Val Asp Asn Gly Lys Asp Gln Thr Tyr Phe Leu Ser Gln Leu Ser 
145 150 155 160 

Gln Glu Gln Leu Gln Lys Thr Met Phe Pro Leu Gly His Leu Lys Lys 
165 170 175 

Pro Glu Val Arg Lys Leu Ala Glu Glu Ala Gly Leu Ser Thr Ala Lys 
180 185 190 

Lys Lys Asp Ser Thr Gly Ile Cys Phe Ile Gly Glu Lys Asn Phe Lys 
195 200 205 

Asn Phe Leu Ser Asn Tyr Leu Pro Ala Gln Pro Gly Arg Met Met Thr 
210 215 220 

Val Asp Gly Arg Asp Met Gly Glu His Ala Gly Leu Met Tyr Tyr Thr 
225 230 235 240 

Ile Gly Gln Arg Gly Gly Leu Gly Ile Gly Gly Gln His Gly Gly Asp 
245 250 255 

Asn Ala Pro Trp Phe Val Val Gly Lys Asp Leu Ser Lys Asn Ile Leu 
260 265 270 

Tyr Val Gly Gln Gly Phe Tyr His Asp Ser Leu Met Ser Thr 
275 280 285 

(2) INFORMATION FOR SEQ ID NO: 138: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 648 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138: 

Met Glu Val Phe Glu Ser Leu Lys Ala Asn Leu Val Gly Lys Asn Ala 
1 5 10 15 

Arg Ile Val Leu Pro Glu Gly Glu Glu Pro Arg Ile Leu Gln Ala Thr 
20 25 30 

Lys Arg Leu Val Lys Glu Thr Glu Val Ile Pro Val Leu Leu Gly Asn 
35 40 45 

Pro Glu Lys Ile Lys Ile Tyr Leu Glu Ile Glu Gly Ile Met Asp Gly 
50 55 60 

Tyr Glu Val Ile Asp Pro Gln His Tyr Pro Gln Phe Glu Glu Met Val 
65 70 75 80 

Ser Ala Leu Val Glu Arg Arg Lys Gly Lys Met Thr Glu Glu Asp Val 
85 90 95 

Arg Lys Val Leu Val Glu Asp Val Asn Tyr Phe Gly Val Met Leu Val 
100 105 110 

Tyr Leu Gly Leu Val Asp Gly Met Val Ser Gly Ala Ile His Ser Thr 
115 120 125 

Ala Ser Thr Val Arg Pro Ala Leu Gln Ile Ile Lys Thr Arg Pro Asn 
130 135 140 

Val Thr Arg Thr Ser Gly Ala Phe Leu Met Val Arg Gly Thr Glu Arg 
145 150 155 160 

Tyr Leu Phe Gly Asp Cys Ala Ile Asn Ile Asn Pro Asp Ala Glu Ala 
165 170 175 

Leu Ala Glu Ile Ala Ile Asn Ser Ala Ile Thr Ala Lys Met Phe Gly 
180 185 190 

Ile Glu Pro Lys Ile Ala Met Leu Ser Tyr Ser Thr Lys Gly Ser Gly 
195 200 205 

Phe Gly Glu Ser Val Asp Lys Val Val Glu Ala Thr Lys Ile Ala His 
210 215 220 

Asp Leu Arg Pro Asp Leu Glu Ile Asp Gly Glu Leu Gln Phe Asp Ala 
225 230 235 240 

Ala Phe Val Pro Glu Thr Ala Ala Leu Lys Ala Pro Gly Ser Thr Val 
245 250 255 

Ala Gly Gln Ala Asn Val Phe Ile Phe Pro Gly Ile Glu Ala Gly Asn 
260 265 270 

Ile Gly Tyr Lys Met Ala Glu Arg Leu Gly Gly Phe Ala Ala Val Gly 
275 280 285 

Pro Val Leu Gln Gly Leu Asn Lys Pro Val Asn Asp Leu Ser Arg Gly 
290 295 300 

Cys Asn Ala Asp Asp Val Tyr Lys Leu Thr Leu Ile Thr Ala Ala Gln 
305 310 315 320 

Ala Val His Gln Met Glu Val Phe Glu Ser Leu Lys Ala Asn Leu Val 
325 330 335 

Gly Lys Asn Ala Arg Ile Val Leu Pro Glu Gly Glu Glu Pro Arg Ile 
340 345 350 



Leu Gln Ala Thr Lys Arg Leu Val Lys Glu Thr Glu Val Ile Pro Val 
355 360 365 

Leu Leu Gly Asn Pro Glu Lys Ile Lys Ile Tyr Leu Glu Ile Glu Gly 
370 375 380 

Ile Met Asp Gly Tyr Glu Val Ile Asp Pro Gln His Tyr Pro Gln Phe 
385 390 395 400 

Glu Glu Met Val Ser Ala Leu Val Glu Arg Arg Lys Gly Lys Met Thr 
405 410 415 

Glu Glu Asp Val Arg Lys Val Leu Val Glu Asp Val Asn Tyr Phe Gly 
420 425 430 

Val Met Leu Val Tyr Leu Gly Leu Val Asp Gly Met Val Ser Gly Ala 
435 440 445 

Ile His Ser Thr Ala Ser Thr Val Arg Pro Ala Leu Gln Ile Ile Lys 
450 455 460 

Thr Arg Pro Asn Val Thr Arg Thr Ser Gly Ala Phe Leu Met Val Arg 
465 470 475 480 

Gly Thr Glu Arg Tyr Leu Phe Gly Asp Cys Ala Ile Asn Ile Asn Pro 
485 490 495 

Asp Ala Glu Ala Leu Ala Glu Ile Ala Ile Asn Ser Ala Ile Thr Ala 
500 505 510 

Lys Met Phe Gly Ile Glu Pro Lys Ile Ala Met Leu Ser Tyr Ser Thr 
515 520 525 

Lys Gly Ser Gly Phe Gly Glu Ser Val Asp Lys Val Val Glu Ala Thr 
530 535 540 

Lys Ile Ala His Asp Leu Arg Pro Asp Leu Glu Ile Asp Gly Glu Leu 
545 550 555 560 

Gln Phe Asp Ala Ala Phe Val Pro Glu Thr Ala Ala Leu Lys Ala Pro 
565 570 575 

Gly Ser Thr Val Ala Gly Gln Ala Asn Val Phe Ile Phe Pro Gly Ile 
580 585 590 

Glu Ala Gly Asn Ile Gly Tyr Lys Met Ala Glu Arg Leu Gly Gly Phe 
595 600 605 

Ala Ala Val Gly Pro Val Leu Gln Gly Leu Asn Lys Pro Val Asn Asp 
610 615 620 

Leu Ser Arg Gly Cys Asn Ala Asp Asp Val Tyr Lys Leu Thr Leu Ile 
625 630 635 640 

Thr Ala Ala Gln Ala Val His Gln 
645 

(2) INFORMATION FOR SEQ ID NO: 139: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 135 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139: 

Met Arg Asn Leu Lys Ser Ile Leu Arg Arg His Ile Ser Leu Leu Gly 
1 5 10 15 

Phe Leu Gly Val Leu Ser Ile Trp Gln Leu Ala Gly Phe Leu Lys Leu 
20 25 30 

Leu Pro Lys Phe Ile Leu Pro Thr Pro Leu Glu Ile Leu Gln Pro Phe 
35 40 45 

Val Arg Asp Arg Glu Phe Leu Trp His His Ser Trp Ala Thr Leu Arg 
50 55 60 

Val Ala Leu Leu Gly Leu Ile Leu Gly Val Leu Ile Ala Cys Leu Met 
65 70 75 80 

Ala Val Leu Met Asp Ser Leu Thr Trp Leu Asn Asp Leu Ile Tyr Pro 
85 90 95 

Met Met Val Val Ile Gln Thr Ile Pro Thr Ile Ala Ile Ala Pro Ile 
100 105 110 

Leu Val Leu Trp Leu Gly Tyr Gly Ile Phe Ala Gln Asp Cys Leu Asp 
115 120 125 

Tyr Leu Asn Asn Asn Leu Ser 
130 135 

(2) INFORMATION FOR SEQ ID NO: 140: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 279 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140: 

Pro Trp Ser Leu Val Asp Glu Tyr Glu Gln Leu Tyr Ala Thr Ile Gly 
1 5 10 15 

Trp His Pro Thr Glu Ala Gly Thr Tyr Thr Glu Glu Val Glu Ala Tyr 
20 25 30 

Leu Leu Asp Lys Leu Lys His Ser Lys Val Val Ala Leu Gly Glu Ile 
35 40 45 

Gly Leu Asp Tyr His Trp Met Thr Ala Pro Glu Val Gln Glu Gln Val 
50 55 60 

Phe Arg Arg Gln Ile Gln Leu Ser Lys Asp Leu Asp Leu Pro Phe Val 
65 70 75 80 

Val His Thr Arg Asp Ala Leu Glu Asp Thr Tyr Glu Ile Ile Lys Ser 
85 90 95 

Glu Gly Val Gly Pro Arg Gly Gly Ile Met His Ser Phe Ser Gly Thr 
100 105 110 

Leu Glu Trp Ala Arg Tyr Arg Asp Leu Gly Met Thr Ile Ser Phe Ser 
115 120 125 

Gly Val Val Thr Phe Lys Lys Ala Thr Asp Leu Gln Glu Ala Ala Lys 
130 135 140 

Glu Leu Pro Leu Asp Lys Met Leu Val Glu Thr Asp Ala Pro Tyr Leu 
145 150 155 160 

Ala Pro Val Pro Lys Arg Gly Arg Glu Asn Lys Thr Ala Tyr Thr Arg 
165 170 175 

Tyr Val Val Asp Phe Ile Ala Asp Leu Arg Gly Met Thr Thr Glu Glu 
180 185 190 

Leu Ala Val Ala Thr Thr Ala Asn Ala Glu Arg Ile Phe Gly Ile Gly 
195 200 205 

Gln Gln Val Met Lys Glu Arg Ile Ser Gln Val Ile Val Val Glu Gly 
210 215 220 

Arg Asp Asp Thr Val Asn Leu Lys Arg Tyr Phe Asp Val Glu Thr Tyr 
225 230 235 240 

Glu Thr Arg Gly Ser Ala Ile Asn Asp Gln Asp Ile Glu Arg Ile Gln 
245 250 255 

Arg Leu His Gln Arg His Gly Val Ile Val Phe Thr Asp Pro Asp Phe 
260 265 270 

Asn Gly Asp Gly Phe Gly Ala 
275 

(2) INFORMATION FOR SEQ ID NO: 141 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 147 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141:

Met Lys Ile Ile Ile Gln Arg Val Lys Lys Ala Gln Val Ser Ile Glu
1 5 10 15

Gly Gln Ile Gln Gly Lys Ile Asn Gln Gly Leu Leu Leu Leu Val Gly
20 25 30

Val Gly Pro Glu Asp Gln Glu Glu Asp Leu Asp Tyr Ala Val Arg Lys
35 40 45

Leu Val Asn Met Arg Ile Phe Ser Asp Ala Glu Gly Lys Met Asn Leu
50 55 60

Ser Val Lys Asp Ile Glu Gly Glu Ile Leu Ser Ile Ser Gln Phe Thr
65 70 75 80

Leu Phe Ala Asp Thr Lys Lys Gly Asn Arg Pro Ala Phe Thr Gly Ala
85 90 95

Ala Lys Pro Asp Met Ala Ser Asp Phe Tyr Asp Ala Phe Asn Gln Lys
100 105 110

Leu Ala Gln Glu Val Pro Val Gln Thr Gly Ile Phe Gly Ala Asp Met
115 120 125

Gln Val Glu Leu Val Asn Asn Gly Pro Val Thr Ile Ile Leu Asp Thr
130 135 140

Lys Lys Arg 
145 

(2) INFORMATION FOR SEQ ID NO: 142:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 238 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142:

Met Ile Leu Ser Met Val Ser Thr Pro Leu Pro Ser Ser Pro Cys Lys
1 5 10 15

Tyr Arg Lys Gln Leu Tyr Leu Gln Glu Asp Leu Arg Gly Lys Asn Val
20 25 30

Glu Lys Val Lys Glu Leu Ala Thr Glu Lys Lys Val Ser Ile Ser Trp
35 40 45

Thr Ser Lys Lys Ser Leu Ser Glu Met Thr Glu Gly Ala Val His Gln
50 55 60

Gly Phe Val Leu Arg Val Ser Glu Phe Ala Tyr Ser Glu Leu Asp Tyr
65 70 75 80

Ile Leu Ala Lys Thr Arg Gln Glu Glu Asn Pro Leu Leu Leu Ile Leu
85 90 95

Asp Gly Leu Thr Asp Pro His Asn Leu Gly Ser Ile Leu Arg Thr Ala
100 105 110

Asp Ala Thr Asn Val Ser Gly Val Ile Ile Pro Lys His Arg Ala Val
115 120 125

Gly Val Thr Pro Val Val Ala Lys Thr Ala Thr Gly Ala Ile Glu His
130 135 140

Val Pro Ile Ala Arg Val Thr Asn Leu Ser Gln Thr Leu Asp Lys Leu
145 150 155 160

Lys Asp Glu Gly Phe Trp Thr Phe Gly Thr Asp Met Asn Gly Thr Pro
165 170 175

Cys His Lys Trp Asn Thr Lys Gly Lys Ile Ala Leu Ile Ile Gly Asn
180 185 190

Glu Gly Lys Gly Ile Ser Ser Asn Ile Lys Lys Gln Val Asp Glu Met
195 200 205

Ile Thr Ile Pro Met Asn Gly His Val Gln Ser Leu Asn Ala Ser Val
210 215 220

Ala Ala Ala Ile Leu Met Tyr Glu Val Phe Arg Asn Arg Leu
225 230 235 

(2) INFORMATION FOR SEQ ID NO: 143: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 345 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 143: 

Met Val Gln Gln Ala Ala Thr Val Ser Leu Met Val Leu Phe Leu Val
1 5 10 15

Pro Gln Leu Arg Asn Ala Tyr Gly Thr Ala Ala Ile Gly Ile Ile Cys
20 25 30

Gly Leu Tyr Trp Ala Val Ser Ser Asn Met Thr Val Glu Ala Thr Gln
35 40 45

Arg Leu Thr Gly Gly Gly Gly Phe Ala Ile Gly His Gln Gln Gln Phe
50 55 60

Ala Ile Trp Phe Val Asp Lys Val Ala Gly Arg Phe Gly Lys Lys Glu
65 70 75 80

Glu Ser Leu Asp Asn Leu Lys Leu Pro Lys Phe Leu Ser Ile Phe His
85 90 95

Asp Thr Val Val Ala Ser Ala Thr Leu Met Leu Val Phe Phe Gly Ala
100 105 110

Ile Leu Leu Ile Leu Gly Pro Asp Ile Met Ser Asn Lys Glu Val Ile
115 120 125

Thr Ser Gly Thr Leu Phe Asn Pro Ala Lys Gln Asp Phe Phe Met Tyr
130 135 140

Ile Ile Gln Thr Ala Phe Thr Phe Ser Val Tyr Leu Phe Val Leu Met
145 150 155 160

Gln Gly Val Arg Met Phe Val Ser Glu Leu Thr Asn Ala Phe Gln Gly
165 170 175

Ile Ser Asn Lys Leu Leu Pro Gly Ser Phe Pro Ala Val Asp Val Ala
180 185 190

Ala Ser Tyr Gly Phe Gly Ser Pro Asn Ala Val Leu Ser Gly Phe Thr
195 200 205

Phe Gly Leu Ile Gly Gln Leu Ile Thr Ile Val Leu Leu Ile Val Phe
210 215 220

Lys Asn Pro Ile Leu Ile Ile Thr Gly Phe Val Pro Val Phe Phe Asp
225 230 235 240

Asn Ala Ala Ile Ala Val Tyr Ala Asp Lys Arg Gly Gly Trp Lys Ala
245 250 255

Ala Val Ile Leu Ser Phe Ile Ser Gly Val Leu Gln Val Ala Leu Gly
260 265 270

Ala Leu Cys Val Ala Leu Leu Asp Leu Ala Ser Tyr Gly Gly Tyr His
275 280 285

Gly Asn Ile Asp Phe Glu Phe Pro Trp Leu Gly Phe Gly Tyr Ile Phe
290 295 300

Lys Tyr Leu Gly Ile Val Gly Tyr Val Leu Val Cys Leu Phe Leu Leu
305 310 315 320

Val Ile Pro Gln Leu Gln Phe Ala Lys Ala Lys Asp Lys Glu Lys Tyr
325 330 335

Tyr Asn Gly Glu Val Gln Glu Glu Ala
340 345 

(2) INFORMATION FOR SEQ ID NO: 144: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 287 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 144:

Met Val Arg Pro Ile Gly Ile Tyr Glu Lys Ala Thr Pro Thr His Phe
1 5 10 15

Thr Trp Leu Glu Arg Leu Asn Phe Ala Lys Glu Leu Gly Phe Asp Phe
20 25 30

Val Glu Met Ser Ile Asp Glu Arg Asp Glu Arg Leu Ala Arg Leu Asp
35 40 45

Trp Ser Lys Glu Glu Arg Leu Glu Val Val Lys Ala Ile Tyr Glu Thr
50 55 60

Gly Val Arg Ile Pro Ser Ile Cys Phe Ser Gly His Arg Arg Tyr Pro
65 70 75 80

Leu Gly Ser Lys Asp Pro Val Leu Glu Glu Lys Ser Leu Glu Leu Met
85 90 95

Lys Lys Cys Ile Glu Leu Ala Gln Asp Leu Gly Val Arg Thr Ile Gln
100 105 110

Leu Ala Gly Tyr Asp Val Tyr Tyr Glu Glu Lys Ser Pro Gln Thr Arg
115 120 125

Gln Arg Phe Ile Lys Asn Leu Arg Lys Ala Cys Asp Trp Ala Glu Glu
130 135 140

Ala Gln Val Val Leu Ala Ile Glu Ile Met Asp Asp Pro Phe Ile Asn
145 150 155 160

Ser Ile Glu Lys Tyr Leu Ala Ile Glu Lys Glu Ile Asp Ser Pro Phe
165 170 175

Leu Phe Val Tyr Pro Asp Ile Gly Asn Val Ser Ala Trp His Asn Asp
180 185 190

Ile Tyr Ser Glu Phe Tyr Leu Gly His His Ala Ile Ala Ala Leu His
195 200 205

Leu Lys Asp Thr Tyr Ala Val Thr Glu Ser Ser Lys Gly Gln Phe Arg
210 215 220

Asp Val Pro Phe Gly Gln Gly Cys Val Lys Trp Glu Glu Ala Phe Asp
225 230 235 240

Ile Leu Lys Glu Thr Asn Tyr Asn Gly Pro Phe Leu Ile Glu Met Trp
245 250 255

Ser Glu Asn Cys Glu Thr Val Glu Glu Thr Arg Ala Ala Val Gln Glu
260 265 270

Ala Gln Ala Phe Leu Tyr Pro Leu Ile Lys Lys Ala Gly Leu Met
275 280 285

(2) INFORMATION FOR SEQ ID NO: 145: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 221 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 145:

Met Thr Lys Arg Ile Pro Asn Leu Gln Val Ala Leu Asp His Ser Asp
1 5 10 15

Leu Gln Gly Ala Ile Lys Ala Ala Val Ser Val Gly Gln Glu Val Asp
20 25 30

Ile Ile Glu Ala Gly Thr Val Cys Leu Leu Gln Val Gly Ser Glu Leu
35 40 45

Ala Glu Val Leu Arg Ser Leu Phe Pro Asp Lys Ile Ile Val Ala Asp
50 55 60

Thr Lys Cys Ala Asp Ala Gly Gly Thr Val Ala Lys Asn Asn Ala Val
65 70 75 80

Arg Gly Ala Asp Trp Met Thr Cys Ile Cys Cys Ala Thr Ile Pro Thr
85 90 95

Met Glu Ala Ala Leu Lys Ala Ile Lys Thr Glu Arg Gly Glu Arg Gly
100 105 110

Glu Ile Gln Ile Glu Leu Tyr Gly Asp Trp Thr Phe Glu Gln Ala Gln
115 120 125

Leu Trp Leu Asp Ala Gly Ile Ser Gln Ala Ile Tyr His Gln Ser Arg
130 135 140

Asp Ala Leu Leu Ala Gly Glu Thr Trp Gly Glu Lys Asp Leu Asn Lys
145 150 155 160

Val Lys Lys Leu Ile Asp Met Gly Phe Arg Val Ser Val Thr Gly Gly
165 170 175

Leu Asp Val Asp Thr Leu Lys Leu Phe Glu Gly Val Asp Val Phe Thr
180 185 190

Phe Ile Ala Gly Arg Gly Ile Thr Glu Ala Ala Asp Pro Ala Gly Ala
195 200 205

Ala Arg Ala Phe Lys Asp Glu Ile Lys Arg Ile Trp Gly
210 215 220 

(2) INFORMATION FOR SEQ ID NO: 146: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 161 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146: 

Met Asn Leu Lys Gln Ala Leu Ile Asp Asn Asp Ser Ile Arg Leu Gly
1 5 10 15

Leu Glu Ala Asn Asn Trp Lys Glu Ala Val Lys Val Ala Val Asp Pro
20 25 30

Leu Ile Glu Ser Gly Ala Ile Leu Pro Glu Tyr Tyr Asp Ala Ile Ile
35 40 45

Glu Ser Thr Glu Glu Tyr Gly Pro Tyr Tyr Ile Leu Met Pro Gly Met
50 55 60

Ala Met Pro His Ala Arg Pro Glu Ala Gly Val Gln Ser Asp Ala Phe
65 70 75 80

Ser Leu Ile Thr Leu Gln Asn Pro Val Val Phe Ser Asp Gly Lys Glu
85 90 95

Val Ser Val Leu Leu Ala Leu Ala Ala Thr Ser Ser Lys Ile His Thr
100 105 110

Ser Val Ala Ile Pro Gln Ile Ile Ala Leu Phe Glu Leu Glu Asp Ser
115 120 125

Ile Ala Arg Leu Gln Ala Cys Gln Thr Lys Glu Asp Val Leu Ala Met
130 135 140

Ile Glu Glu Ser Lys Asp Ser Pro Tyr Leu Glu Gly Leu Asp Leu Glu
145 150 155 160 

Ser 



(2) INFORMATION FOR SEQ ID NO: 147: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 293 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 147: 

Met Ser Arg Asp Ile Ile Lys Leu Asp Gln Ile Asp Val Thr Phe His
1 5 10 15

Gln Lys Lys Arg Thr Ile Thr Ala Val Lys Asp Val Thr Ile His Ile
20 25 30

Gln Glu Gly Asp Ile Tyr Gly Ile Val Gly Tyr Ser Gly Ala Gly Lys
35 40 45

Ser Thr Leu Val Arg Val Ile Asn Leu Leu Gln Lys Pro Ser Ala Gly
50 55 60

Lys Ile Thr Ile Asp Asp Asp Val Ile Phe Asp Gly Lys Val Thr Leu
65 70 75 80

Thr Ala Glu Gln Leu Arg Arg Lys Arg Gln Asp Ile Gly Met Ile Phe
85 90 95

Gln His Phe Asn Leu Met Ser Gln Lys Thr Ala Glu Glu Asn Val Ala
100 105 110

Phe Ala Leu Lys His Ser Gly Leu Ser Lys Glu Glu Lys Lys Ala Lys
115 120 125

Val Ala Lys Leu Leu Asp Leu Val Gly Leu Ala Asp Arg Ala Glu Asn
130 135 140

Tyr Pro Ser Gln Leu Ser Gly Gly Gln Lys Gln Arg Val Ala Ile Ala
145 150 155 160

Arg Ala Leu Ala Asn Asp Pro Lys Ile Leu Ile Ser Asp Glu Ser Thr
165 170 175

Ser Ala Leu Asp Pro Lys Thr Thr Lys Gln Ile Leu Ala Leu Leu Gln
180 185 190

Asp Leu Asn Gln Lys Leu Gly Leu Thr Val Val Leu Ile Thr His Glu
195 200 205

Met Gln Ile Val Lys Asp Ile Ala Asn Arg Val Ala Val Met Gln Asp
210 215 220

Gly His Leu Ile Glu Glu Gly Ser Val Leu Glu Ile Phe Ser Asn Pro
225 230 235 240

Lys Gln Pro Leu Thr Gln Asp Phe Ile Ser Thr Ala Thr Gly Ile Asp
245 250 255

Glu Ala Met Val Lys Ile Glu Lys Gln Glu Ile Val Glu His Leu Ser
260 265 270

Glu Asn Ser Leu Leu Val Gln Leu Gln Val Arg Trp Ser Phe Asn Arg
275 280 285



Arg Ala Thr Phe Glu 
290 



(2) INFORMATION FOR SEQ ID NO: 148: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 209 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 



(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 148:

Arg Asp Val Asn Phe Glu Ile Glu Lys Gly Glu Leu Val Ile Ile Leu
1 5 10 15

Gly Ala Ser Gly Ala Gly Lys Ser Thr Val Leu Asn Leu Leu Gly Gly
20 25 30

Met Asp Thr Asn Asp Glu Gly Glu Ile Trp Ile Asp Gly Val Asn Ile
35 40 45

Ala Asp Tyr Ser Ser His Gln Arg Thr Asn Tyr Arg Arg Asn Asp Val
50 55 60

Gly Phe Val Phe Gln Phe Tyr Asn Leu Val Ser Asn Leu Thr Ala Lys
65 70 75 80

Glu Asn Val Glu Leu Ser Glu Ile Val Thr Asp Ala Leu Asn Ser Asp
85 90 95

Gln Val Leu Thr Asp Val Gly Leu Ala His Arg Leu Asn Asn Phe Pro
100 105 110

Ala Gln Leu Ser Gly Gly Glu Gln Gln Arg Val Ser Ile Ala Arg Ala
115 120 125

Val Ala Lys Asn Pro Lys Ile Leu Leu Cys Asp Glu Pro Thr Gly Ala
130 135 140

Leu Asp Tyr Gln Thr Gly Lys Gln Val Leu Lys Ile Leu Gln Asp Met
145 150 155 160

Ser Arg Gln Lys Gly Ala Thr Val Ile Ile Val Thr His Asn Gly Ala
165 170 175

Leu Ala Pro Ile Ala Asp Arg Val Ile Gln Met His Asp Ala Ser Val
180 185 190

Lys Asp Val Val Leu Asn Gln His Pro Gln Asp Ile Asp Ser Leu Glu
195 200 205



Tyr 



(2) INFORMATION FOR SEQ ID NO: 149: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 213 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 149: 

Met Ile Glu Leu Lys Asn Ile Thr Lys Thr Ile Gly Gly Lys Val Ile
1 5 10 15

Leu Asp Asn Leu Ser Leu Arg Ile Asp Gln Gly Asp Leu Val Ala Ile
20 25 30

Val Gly Lys Ser Gly Ser Gly Lys Ser Thr Leu Leu Asn Leu Leu Gly
35 40 45

Leu Ile Asp Gly Asp Tyr Ser Gly Arg Tyr Glu Ile Phe Gly Gln Thr
50 55 60

Asn Leu Ala Val Asn Ser Ala Lys Ser Gln Thr Ile Ile Arg Glu His
65 70 75 80

Ile Ser Tyr Leu Phe Gln Asn Phe Ala Leu Ile Asp Asp Glu Thr Val
85 90 95

Glu Tyr Asn Leu Met Leu Ala Leu Lys Tyr Val Lys Leu Pro Lys Lys
100 105 110

Asp Lys Leu Lys Lys Val Glu Glu Ile Leu Glu Arg Val Gly Leu Ser
115 120 125

Ala Thr Leu His Gln Arg Val Ser Glu Leu Ser Gly Gly Glu Gln Gln
130 135 140

Arg Ile Ala Val Ala Arg Ala Ile Leu Lys Pro Ser Gln Leu Ile Leu
145 150 155 160

Ala Asp Glu Pro Thr Gly Ser Leu Asp Pro Glu Asn Arg Asp Leu Val
165 170 175

Leu Lys Phe Leu Leu Glu Met Asn Arg Glu Gly Lys Thr Val Ile Ile
180 185 190

Val Thr His Asp Ala Tyr Val Ala Gln Gln Cys His Arg Val Ile Glu
195 200 205

Leu Gly Glu Gly Lys
210

(2) INFORMATION FOR SEQ ID NO: 150:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 84 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 150:



Ala Lys Pro Lys Gln Leu Ser Gly Gly Gln Lys Gln Arg Val Ala Ile
1 5 10 15

Ala Arg Ala Leu Ser Met Asn Pro Asp Ala Ile Leu Phe Asp Glu Pro
20 25 30

Thr Ser Ala Leu Asp Pro Glu Met Val Gly Glu Val Leu Lys Ile Met
35 40 45

Gln Asp Leu Ala Gln Glu Gly Leu Thr Met Ile Val Val Thr His Glu
50 55 60

Met Glu Phe Ala Arg Asp Val Ser His Arg Val Ile Phe Met Asp Lys
65 70 75 80

Gly Val Ile Pro



(2) INFORMATION FOR SEQ ID NO: 151: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 118 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 151: 



Tyr Tyr Gly Asp Tyr His Ala Leu Arg Asn Ile Asn Leu Arg Phe Glu
1 5 10 15

Lys Gly Gln Val Val Val Leu Leu Gly Pro Ser Gly Ser Gly Lys Ser
20 25 30

Thr Leu Ile Arg Thr Ile Asn Gly Leu Glu Ala Val Asp Lys Gly Ser
35 40 45

Leu Leu Val Asn Gly His Gln Val Ala Gly Ala Ser Gln Lys Asp Leu
50 55 60

Val Pro Leu Arg Lys Glu Val Gly Met Val Phe Gln His Phe Asn Leu
65 70 75 80

Tyr Pro His Lys Thr Val Leu Glu Asn Val Thr Leu Ala Pro Ile Lys
85 90 95

Val Leu Gly Ile Asp Lys Lys Glu Ala Glu Lys Thr Ala Gln Lys Tyr
100 105 110 

Leu Glu Phe Val Asn Met 
115 

(2) INFORMATION FOR SEQ ID NO: 152: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 237 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:152: 

Met Thr Lys Lys Gln Leu His Leu Val Ile Val Thr Gly Met Gly Gly
1 5 10 15

Ala Gly Lys Thr Val Ala Ile Gln Ser Phe Glu Asp Leu Gly Tyr Phe
20 25 30

Thr Ile Asp Asn Met Pro Pro Ala Leu Leu Pro Lys Phe Leu Gln Leu
35 40 45

Val Glu Ile Lys Glu Asp Asn Pro Lys Leu Ala Leu Val Val Asp Met
50 55 60

Arg Ser Arg Ser Phe Phe Ser Glu Ile Gln Ala Val Leu Asp Glu Leu
65 70 75 80

Glu Asn Gln Asp Gly Leu Asp Phe Lys Ile Leu Phe Leu Asp Ala Ala
85 90 95

Asp Lys Glu Leu Val Ala Arg Tyr Lys Glu Thr Arg Arg Ser His Pro
100 105 110

Leu Ala Ala Asp Gly Arg Ile Leu Asp Gly Ile Lys Leu Glu Arg Glu
115 120 125

Leu Leu Ala Pro Leu Lys Asn Met Ser Gln Asn Val Val Asp Thr Thr
130 135 140

Glu Leu Thr Pro Arg Glu Leu Arg Lys Thr Leu Ala Glu Gln Phe Ser
145 150 155 160

Asp Gln Glu Gln Ala Gln Ser Phe Arg Ile Glu Val Met Ser Phe Gly
165 170 175

Phe Lys Tyr Gly Ile Pro Ile Asp Ala Asp Leu Val Phe Asp Val Arg
180 185 190

Phe Leu Pro Asn Pro Tyr Tyr Leu Pro Glu Leu Arg Asn Gln Thr Gly
195 200 205

Val Asp Glu Pro Val Tyr Asp Tyr Val Met Asn His Pro Glu Ser Glu
210 215 220

Asp Phe Tyr Gln His Leu Leu Ala Leu Ile Glu Pro Ile
225 230 235

(2) INFORMATION FOR SEQ ID NO: 153: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 180 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 153:

Met Leu Glu Asn Asp Ile Lys Lys Val Leu Val Ser His Asp Glu Ile
1 5 10 15

Thr Glu Ala Ala Lys Lys Leu Gly Ala Gln Leu Thr Lys Asp Tyr Ala
20 25 30

Gly Lys Asn Pro Ile Leu Val Gly Ile Leu Lys Gly Ser Ile Pro Phe
35 40 45

Met Ala Glu Leu Val Lys His Ile Asp Thr His Ile Glu Met Asp Phe
50 55 60

Met Met Val Ser Ser Tyr His Gly Gly Thr Ala Ser Ser Gly Val Ile
65 70 75 80

Asn Ile Lys Gln Asp Val Thr Gln Asp Ile Lys Gly Arg His Val Leu
85 90 95

Phe Val Glu Asp Ile Ile Asp Thr Gly Gln Thr Leu Lys Asn Leu Arg
100 105 110

Asp Met Phe Lys Glu Arg Glu Ala Ala Ser Val Lys Ile Ala Thr Leu
115 120 125

Leu Asp Lys Pro Glu Gly Arg Val Val Glu Ile Glu Ala Asp Tyr Thr
130 135 140

Cys Phe Thr Ile Pro Asn Glu Phe Val Val Gly Tyr Gly Leu Asp Tyr
145 150 155 160

Lys Glu Asn Tyr Arg Asn Leu Pro Tyr Ile Gly Val Leu Lys Glu Glu
165 170 175

Val Tyr Ser Asn 
180 

(2) INFORMATION FOR SEQ ID NO: 154 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 193 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 154: 

Met Lys Ile Gly Ile Leu Ala Leu Gln Gly Ala Phe Ala Glu His Ala
1 5 10 15

Lys Val Leu Asp Gln Leu Gly Val Glu Ser Val Glu Leu Arg Asn Leu
20 25 30

Asp Asp Phe Gln Gln Asp Gln Ser Asp Leu Ser Gly Leu Ile Leu Pro
35 40 45

Gly Gly Glu Ser Thr Thr Met Gly Lys Leu Leu Arg Asp Gln Asn Met
50 55 60

Leu Leu Pro Ile Arg Glu Ala Ile Leu Ser Gly Leu Pro Val Phe Gly
65 70 75 80

Thr Cys Ala Gly Leu Ile Leu Leu Ala Lys Glu Ile Thr Ser Gln Lys
85 90 95

Glu Ser His Leu Gly Thr Met Asp Met Val Val Glu Arg Asn Ala Tyr
100 105 110

Gly Arg Gln Leu Gly Ser Phe Tyr Thr Glu Ala Glu Cys Lys Gly Val
115 120 125

Gly Lys Ile Pro Met Thr Phe Ile Arg Gly Pro Ile Ile Ser Ser Val
130 135 140

Gly Glu Gly Val Glu Ile Leu Ala Ile Val Asn Asn Gln Ile Val Ala
145 150 155 160

Ala Gln Glu Lys Asn Met Leu Val Ser Ser Phe His Pro Glu Leu Thr
165 170 175

Asp Asp Val Arg Leu His Gln Tyr Phe Ile Asn Met Cys Lys Glu Lys
180 185 190

Ser 

(2) INFORMATION FOR SEQ ID NO: 155:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 189 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 155:

Glu Ser Glu Val Leu Ser Pro Ala Asp Asp Arg Phe His Val Asp Lys
1 5 10 15

Lys Glu Phe Gln Val Pro Phe Val Cys Gly Ala Lys Asp Leu Gly Glu
20 25 30

Ala Leu Arg Arg Ile Ala Glu Gly Ala Ser Met Ile Arg Thr Lys Gly
35 40 45

Glu Pro Gly Thr Gly Asp Ile Val Gln Ala Val Arg His Met Arg Met
50 55 60

Met Asn Gln Glu Ile Arg Arg Ile Gln Asn Leu Arg Glu Asp Glu Leu
65 70 75 80

Tyr Val Ala Ala Lys Asp Leu Gln Val Pro Val Glu Leu Val Gln Tyr
85 90 95

Val His Glu His Gly Lys Leu Pro Val Val Asn Phe Ala Ala Gly Gly
100 105 110

Val Ala Thr Pro Ala Asp Ala Ala Leu Met Met Gln Leu Gly Ala Glu
115 120 125

Gly Val Phe Val Gly Ser Gly Ile Phe Lys Ser Gly Asp Pro Val Lys
130 135 140

Arg Ala Ser Ala Ile Val Lys Ala Val Thr Asn Phe Arg Asn Pro Gln
145 150 155 160

Ile Leu Ala Gln Ile Ser Glu Asp Leu Gly Glu Ala Met Val Gly Ile
165 170 175

Asn Glu Asn Ile Gln Ile Leu Met Ala Glu Arg Gly Lys
180 185

(2) INFORMATION FOR SEQ ID NO: 156: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 162 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 156:

Asp Lys Gly Trp Phe Val Leu Gln Thr Tyr Ser Gly Tyr Glu Asn Lys
1 5 10 15

Val Lys Glu Asn Leu Leu Gln Arg Ala Gln Thr Tyr Asn Met Leu Asp
20 25 30

Asn Ile Leu Arg Val Glu Ile Pro Thr Gln Thr Val Gln Val Glu Lys
35 40 45

Asn Gly Lys Arg Lys Glu Val Glu Glu Asn Arg Phe Pro Gly Tyr Val
50 55 60

Leu Val Glu Met Val Met Thr Asp Glu Ala Trp Phe Val Val Arg Asn
65 70 75 80

Ala Gln Ser Pro Thr Lys Phe Ile Ser Glu Gln Thr Ala Tyr Glu Ile
85 90 95

Asp Glu Glu Val Arg Ser Leu Leu Asn Glu Ala Arg Asn Lys Ala Ala
100 105 110

Glu Ile Ile Gln Ser Asn Arg Glu Thr His Lys Leu Ile Ala Glu Ala
115 120 125

Leu Leu Lys Tyr Glu Thr Leu Asp Ser Thr Gln Ile Lys Ala Leu Tyr
130 135 140

Glu Thr Gly Lys Met Pro Glu Ser Ser Arg Arg Gly Ile Ser Cys Thr
145 150 155 160

Ile Leu

(2) INFORMATION FOR SEQ ID NO: 157:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 208 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 157:



Val Asn Ser Ser Ser Val Pro Gly Asp Arg Phe Ser Val Leu Leu Glu
1 5 10 15

His Lys Gly Ile His Pro Ile Val Tyr Ile Ser Lys Met Asp Leu Leu
20 25 30

Glu Asp Arg Gly Glu Leu Asp Phe Tyr Gln Gln Thr Tyr Gly Asp Ile
35 40 45

Gly Tyr Asp Phe Val Thr Ser Lys Glu Glu Leu Leu Ser Leu Leu Thr
50 55 60

Gly Lys Val Thr Val Phe Met Gly Gln Thr Gly Val Gly Lys Ser Thr
65 70 75 80

Leu Leu Asn Lys Ile Ala Pro Asp Leu Asn Leu Glu Thr Gly Glu Ile
85 90 95

Ser Asp Ser Leu Gly Arg Gly Arg His Thr Thr Arg Ala Val Ser Phe
100 105 110

Tyr Asn Leu Asn Gly Gly Lys Ile Ala Asp Thr Pro Gly Phe Ser Ser
115 120 125

Leu Asp Tyr Glu Val Ser Arg Ala Glu Asp Leu Asn Gln Ala Phe Pro
130 135 140

Glu Ile Ala Thr Val Ser Arg Asp Cys Lys Phe Arg Thr Cys Thr His
145 150 155 160

Thr His Glu Pro Ser Cys Ala Val Lys Pro Ala Val Glu Glu Gly Val
165 170 175

Ile Ala Thr Phe Arg Phe Asp Asn Tyr Leu Gln Phe Leu Ser Glu Ile
180 185 190

Glu Asn Arg Arg Glu Thr Tyr Lys Lys Val Ser Lys Lys Ile Pro Lys
195 200 205

(2) INFORMATION FOR SEQ ID NO:158: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 403 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 158: 

Gln Gln Ser Val Lys Lys Lys Val Leu Pro Ala Ile Glu Arg Arg Ile
1 5 10 15

Arg Thr Glu Leu Thr Glu Lys Ala Glu Glu Gly Ala Ile Gln Leu Phe
20 25 30

Ser Asp Asn Leu Arg Asn Leu Leu Leu Val Ala Pro Leu Lys Gly Arg
35 40 45

Val Val Leu Gly Phe Asp Pro Ala Phe Arg Thr Gly Ala Lys Leu Ala
50 55 60

Val Val Asp Ala Thr Gly Lys Met Leu Thr Thr Gln Val Ile Tyr Pro
65 70 75 80

Val Lys Pro Ala Ser Ala Arg Gln Ile Glu Glu Ala Lys Lys Asp Leu
85 90 95

Ala Asp Leu Ile Gly Gln Tyr Gly Val Glu Ile Ile Ala Ile Gly Asn
100 105 110

Gly Thr Ala Ser Arg Glu Ser Glu Ala Phe Val Ala Glu Val Leu Lys
115 120 125

Asp Phe Pro Glu Val Ser Tyr Val Ile Val Asn Glu Ser Gly Ala Ser
130 135 140

Val Tyr Ser Ala Ser Glu Leu Ala Arg Gln Glu Phe Pro Asp Leu Thr
145 150 155 160

Val Glu Lys Arg Ser Ala Ile Ser Ile Ala Arg Arg Leu Gln Asp Pro
165 170 175

Leu Ala Glu Leu Val Lys Ile Asp Pro Lys Ser Ile Gly Val Gly Gln
180 185 190

Tyr Gln His Asp Val Ser Gln Lys Lys Leu Ser Glu Ser Leu Asp Phe
195 200 205

Val Val Asp Thr Val Val Asn Gln Val Gly Val Asn Val Asn Thr Ala
210 215 220

Ser Pro Ala Leu Leu Ser His Val Ala Gly Leu Asn Lys Thr Ile Ser
225 230 235 240

Glu Asn Ile Val Lys Tyr Arg Glu Glu Glu Gly Lys Ile Thr Ser Arg
245 250 255

Ala Gln Ile Lys Lys Val Pro Arg Leu Gly Ala Lys Ala Phe Glu Gln
260 265 270

Ala Ala Gly Phe Leu Arg Ile Pro Glu Ser Ser Asn Ile Leu Asp Asn
275 280 285

Thr Gly Val His Pro Glu Asn Tyr Thr Ala Val Lys Leu Phe Lys Arg
290 295 300

Leu Asp Ile Lys Asp Leu Asn Glu Glu Ala Ser Lys Leu Lys Ser Leu
305 310 315 320



Ser Val Lys Glu Met Ala Gln Glu Leu Asp Leu Gly Pro Glu Thr Leu
325 330 335

Lys Asp Ile Ile Ala Asp Leu Leu Lys Pro Gly Arg Asp Phe Arg Asp
340 345 350

Ser Phe Asp Ala Pro Val Leu Arg Gln Asp Val Leu Asp Ile Lys Asp
355 360 365

Leu Val Val Gly Gln Lys Leu Glu Gly Val Val Arg Asn Val Val Asp
370 375 380

Phe Gly Ala Phe Val Asp Ile Gly Val His Glu Asp Gly Leu Ile His
385 390 395 400

Ile Leu Ile



(2) INFORMATION FOR SEQ ID NO: 159: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 179 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 159: 

Met Phe Arg Ala Ala Met Ala Asn Gln Thr Glu Met Gly Val Leu Ala
1 5 10 15

Lys Ser Tyr Ile Asp Lys Gly Glu Leu Val Pro Asp Glu Val Thr Asn
20 25 30

Gly Ile Val Lys Glu Arg Leu Ser Gln Asp Asp Ile Lys Glu Thr Gly
35 40 45

Phe Leu Leu Asp Gly Tyr Pro Arg Thr Ile Glu Gln Ala His Ala Leu
50 55 60

Asp Lys Thr Leu Ala Glu Leu Gly Ile Glu Leu Glu Gly Ile Ile Asn
65 70 75 80

Ile Glu Val Asn Pro Asp Ser Leu Leu Glu Arg Leu Ser Gly Arg Ile
85 90 95

Ile His Arg Val Thr Gly Glu Thr Phe His Lys Val Phe Asn Pro Pro
100 105 110

Val Asp Tyr Lys Glu Glu Asp Tyr Tyr Gln Arg Glu Asp Asp Lys Pro
115 120 125

Glu Thr Val Lys Arg Arg Leu Asp Val Asn Ile Ala Gln Gly Glu Pro
130 135 140

Ile Ile Ala His Tyr Arg Ala Lys Gly Leu Val His Asp Ile Glu Gly
145 150 155 160

Asn Gln Asp Ile Asn Asp Val Phe Ser Asp Ile Glu Lys Val Leu Thr
165 170 175

Asn Leu Lys 

(2) INFORMATION FOR SEQ ID NO: 160:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 160:

Met Ile Glu Phe Glu Lys Pro Asn Ile Thr Lys Ile Asp Glu Asn Lys
1 5 10 15

Asp Tyr Gly Lys Phe Val Ile Glu Pro Leu Glu Arg Gly Tyr Gly Thr
20 25 30

Thr Leu Gly Asn Ser Leu Arg Arg Val Leu Leu Ala Ser Leu Pro Gly
35 40 45

Ala Ala Val Thr Ser Ile Asn Ile Asp Gly Val Leu His Glu Phe Asp
50 55 60

Thr Val Pro Gly Val Arg Glu Asp Val Met Gln Ile Ile Leu Asn Ile
65 70 75 80

Lys Gly Ile Ala Val Lys Ser Tyr Val Glu Asp Glu Lys Ile Ile Glu
85 90 95

Leu Asp Val Glu Gly Pro Ala Glu Val Thr Ala Gly Asp Ile Leu Thr
100 105 110

Asp Ser Asp Ile Glu Ile Val Asn Pro Asp His Tyr Leu Phe Thr Ile
115 120 125

Gly Glu Gly Ser Ser Leu Lys Ala Thr Met Thr Val Asn Ser Gly Arg
130 135 140

Gly Tyr Val Pro Ala Asp Glu Asn Lys Lys Asp Asn Ala Pro Val Gly
145 150 155 160

Thr Leu Ala Val Asp Ser Ile Tyr Thr Pro Val Thr Lys Val Asn Tyr
165 170 175

Gln Val Glu Pro Ala Arg Val Gly Ser Asn Asp Gly Phe Asp Ser
180 185 190 

(2) INFORMATION FOR SEQ ID NO: 161: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 335 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant



(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 161 : 

Glu Tyr Leu Gly Ala Thr Val Gln Val Ile Pro His Ile Thr Asp Ala
1 5 10 15

Leu Lys Glu Lys Ile Lys Ser Ala Ala Leu Thr Thr Asp Ser Asp Val
20 25 30

Ile Ile Thr Glu Val Gly Gly Thr Val Gly Asp Ile Glu Ser Leu Pro
35 40 45

Phe Leu Glu Ala Leu Arg Gln Met Lys Ala Asp Val Gly Ala Asp Asn
50 55 60

Val Met Tyr Ile His Thr Thr Leu Pro Tyr Leu Lys Ala Ala Gly Glu
65 70 75 80

Met Lys Lys Pro Thr Gln His Ser Val Lys Leu Arg Gly Leu Gly Ile
85 90 95

Gln Pro Asn Met Leu Val Ile Arg Thr Glu Glu Pro Ala Gly Gln Gly
100 105 110

Ile Lys Asn Lys Leu Ala Gln Phe Cys Asp Val Ala Pro Glu Ser Leu
115 120 125

Ile Glu Ser Leu Asp Val Glu His Leu Tyr Gln Ile Pro Leu Asn Leu
130 135 140

Gln Ala Gln Gly Met Asp Gln Ile Val Cys Asp His Leu Lys Leu Asp
145 150 155 160

Ala Pro Ala Ala Asp Met Thr Glu Trp Ser Ala Met Val Asp Lys Val
165 170 175

Met Asn Leu Lys Lys Gln Val Lys Ile Ser Leu Val Gly Lys Tyr Val
180 185 190

Glu Leu Gln Asp Ala Tyr Ile Ser Val Val Glu Ala Leu Lys His Ser
195 200 205

Gly Tyr Val Asn Asp Val Glu Val Lys Ile Asn Trp Val Asn Ala Asn
210 215 220

Asp Val Thr Ala Glu Asn Val Ala Glu Leu Leu Ser Asp Ala Asp Gly
225 230 235 240

Ile Ile Val Pro Gly Gly Phe Gly Gln Arg Gly Thr Glu Gly Lys Ile
245 250 255

Gln Ala Ile Arg Tyr Ala Arg Glu Asn Asp Val Pro Met Leu Gly Val
260 265 270

Cys Leu Gly Met Gln Leu Thr Cys Ile Glu Phe Ala Arg His Val Leu
275 280 285

Gly Leu Glu Gly Ala Asn Ser Ala Glu Leu Ala Pro Glu Thr Lys Tyr
290 295 300

Pro Ile Ile Asp Ile Met Arg Asp Gln Ile Asp Ile Glu Asp Met Gly
305 310 315 320 

Gly Thr Leu Arg Leu Gly Leu Tyr Pro Ser Lys Leu Lys Arg Leu 
325 330 335 

(2) INFORMATION FOR SEQ ID NO: 162 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 301 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 162: 

Met Ser Glu Lys Leu Val Glu Ile Lys Asp Leu Glu Ile Ser Phe Gly
1 5 10 15

Glu Gly Ser Lys Lys Phe Val Ala Val Lys Asn Ala Asn Phe Phe Ile
20 25 30

Asn Lys Gly Glu Thr Phe Ser Leu Val Gly Glu Ser Gly Ser Gly Lys
35 40 45

Thr Thr Ile Gly Arg Ala Ile Ile Gly Leu Asn Asp Thr Ser Asn Gly
50 55 60

Asp Ile Ile Phe Asp Gly Gln Lys Ile Asn Gly Lys Lys Ser Arg Glu
65 70 75 80

Gln Ala Ala Glu Leu Ile Arg Arg Ile Gln Met Ile Phe Gln Asp Pro
85 90 95

Ala Ala Ser Leu Asn Glu Arg Ala Thr Val Asp Tyr Ile Ile Ser Glu
100 105 110

Gly Leu Tyr Asn His Arg Leu Phe Lys Asp Glu Glu Glu Arg Lys Glu
115 120 125

Lys Val Gln Ser Ile Ile Arg Glu Val Gly Leu Leu Ala Glu His Leu
130 135 140

Thr Arg Tyr Pro His Glu Phe Ser Gly Gly Gln Arg Gln Arg Ile Gly
145 150 155 160

Ile Ala Arg Ala Leu Val Met Gln Pro Asp Phe Val Ile Ala Asp Glu
165 170 175

Pro Ile Ser Ala Leu Asp Val Ser Val Arg Ala Gln Val Leu Asn Leu
180 185 190

Leu Lys Lys Phe Gln Lys Glu Leu Gly Leu Thr Tyr Leu Phe Ile Ala
195 200 205

His Asp Leu Ser Val Val Arg Phe Ile Ser Asp Arg Ile Ala Val Ile
210 215 220

Tyr Lys Gly Val Ile Val Glu Val Ala Glu Thr Glu Glu Leu Phe Asn
225 230 235 240

Asn Pro Ile His Pro Tyr Thr Gln Ala Leu Leu Ser Ala Val Pro Ile
245 250 255

Pro Asp Pro Ile Leu Glu Arg Lys Lys Val Leu Lys Val Tyr Asp Pro
260 265 270

Ser Gln His Asp Tyr Glu Thr Asp Lys Pro Ser Met Val Glu Ile Arg
275 280 285

Pro Gly His Tyr Val Trp Ala Asn Gln Ala Glu Leu Ala
290 295 300 

(2) INFORMATION FOR SEQ ID NO: 163: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 151 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 163: 

Gln Ile Gln Lys Ser Phe Lys Gly Gln Ser Pro Tyr Gly Lys Leu Tyr
1 5 10 15

Leu Val Ala Thr Pro Ile Gly Asn Leu Asp Asp Met Thr Phe Arg Ala
20 25 30

Ile Gln Thr Leu Lys Glu Val Asp Trp Ile Ala Ala Glu Asp Thr Arg
35 40 45

Asn Thr Gly Leu Leu Leu Lys His Phe Asp Ile Ser Thr Lys Gln Ile
50 55 60

Ser Phe His Glu His Asn Ala Lys Glu Lys Ile Pro Asp Leu Ile Gly
65 70 75 80

Phe Leu Lys Ala Gly Gln Ser Ile Ala Gln Val Ser Asp Ala Gly Leu
85 90 95

Pro Ser Ile Ser Asp Pro Gly His Asp Leu Val Lys Ala Ala Ile Glu
100 105 110

Glu Glu Ile Ala Val Val Thr Val Pro Gly Ala Ser Ala Gly Ile Ser
115 120 125

Ala Leu Ile Ala Ser Gly Leu Ala Pro Gln Pro His Ile Phe Tyr Gly
130 135 140

Phe Leu Pro Arg Lys Ser Gly
145 150

(2) INFORMATION FOR SEQ ID NO: 164: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 258 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 164:

Ser Arg Lys Asp Lys Gln Glu Arg Ile Ser Lys Glu Thr Met Glu Ile
1 5 10 15

Tyr Ala Pro Leu Ala His Arg Leu Gly Ile Ser Ser Val Lys Trp Glu
20 25 30

Leu Glu Asp Leu Ser Phe Arg Tyr Leu Asn Pro Thr Glu Phe Tyr Lys
35 40 45

Ile Thr His Met Met Lys Glu Lys Arg Arg Glu Arg Glu Ala Leu Val
50 55 60

Asp Glu Val Val Thr Lys Leu Glu Glu Tyr Thr Thr Glu Arg His Leu
65 70 75 80

Lys Gly Lys Ile Tyr Gly Arg Pro Lys His Ile Tyr Ser Ile Phe Arg
85 90 95

Lys Met Gln Asp Lys Arg Lys Arg Phe Glu Glu Ile Tyr Asp Leu Ile
100 105 110

Ala Ile Arg Cys Ile Leu Asp Thr Gln Ser Asp Val Tyr Ala Met Leu
115 120 125

Gly Tyr Val His Glu Phe Trp Lys Pro Met Pro Gly Arg Phe Lys Asp
130 135 140

Tyr Ile Ala Asn Arg Lys Ala Asn Gly Tyr Gln Ser Ile His Thr Thr
145 150 155 160

Val Tyr Gly Pro Lys Gly Pro Ile Glu Phe Gln Ile Arg Thr Lys Glu
165 170 175

Met His Glu Val Ala Glu Tyr Gly Val Ala Ala His Trp Ala Tyr Lys
180 185 190

Lys Gly Ile Lys Gly Gln Val Asn Ser Lys Glu Ser Ala Ile Gly Met
195 200 205

Asn Trp Ile Lys Glu Met Met Glu Leu Gln Asp Gln Ala Asp Asp Ala
210 215 220

Lys Glu Phe Val Asp Ser Val Lys Glu Asn Tyr Leu Ala Glu Glu Ile
225 230 235 240

Thr Val Leu Pro Gln Met Glu Leu Ser Val Pro Ser Gln Arg Phe Arg
245 250 255

Thr Asp

(2) INFORMATION FOR SEQ ID NO: 165: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 289 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 165: 

Thr Lys Val Gly Gly Glu Ala Asp Tyr Leu Val Phe Pro Arg Asn Arg
1 5 10 15

Phe Glu Leu Ala Arg Val Val Lys Phe Ala Asn Gln Glu Asn Ile Pro
20 25 30

Trp Met Val Leu Gly Asn Ala Ser Asn Ile Ile Val Arg Asp Gly Gly
35 40 45

Ile Arg Gly Phe Val Ile Leu Cys Asp Lys Leu Asn Asn Val Ser Val
50 55 60

Asp Gly Tyr Thr Ile Glu Ala Glu Ala Gly Ala Asn Leu Ile Glu Thr
65 70 75 80

Thr Arg Ile Ala Leu Arg His Ser Leu Thr Gly Phe Glu Phe Ala Cys
85 90 95

Gly Ile Pro Gly Ser Val Gly Gly Ala Val Phe Met Asn Ala Gly Ala
100 105 110

Tyr Gly Gly Glu Ile Ala His Ile Leu Gln Ser Cys Lys Val Leu Thr
115 120 125

Lys Asp Gly Glu Ile Glu Thr Leu Ser Ala Lys Asp Leu Ala Phe Gly
130 135 140

Tyr Arg His Ser Ala Ile Gln Glu Ser Gly Ala Val Val Leu Ser Val
145 150 155 160

Lys Phe Ala Leu Ala Pro Gly Thr His Gln Val Ile Lys Gln Glu Met
165 170 175

Asp Arg Leu Thr His Leu Arg Glu Leu Lys Gln Pro Leu Glu Tyr Pro
180 185 190

Ser Cys Gly Ser Val Phe Lys Arg Pro Val Gly His Phe Ala Gly Gln
195 200 205

Leu Ile Ser Glu Ala Gly Leu Lys Gly Tyr Arg Ile Gly Gly Val Glu
210 215 220

Val Ser Glu Lys His Ala Gly Phe Met Ile Asn Val Ala Asp Gly Thr
225 230 235 240

Ala Lys Asp Tyr Glu Asp Leu Ile Gln Ser Val Ile Glu Lys Val Lys
245 250 255

Glu His Ser Gly Ile Thr Leu Glu Arg Glu Val Arg Ile Leu Gly Glu
260 265 270

Ser Leu Ser Val Ala Lys Met Tyr Ala Gly Gly Phe Thr Pro Cys Lys
275 280 285

Arg



(2) INFORMATION FOR SEQ ID NO: 166: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 114 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 166:

Ala Lys Arg Arg Lys Leu Val Lys Ser Thr Thr Leu Leu Leu Ala Cys
1 5 10 15

Leu Gln Lys Pro Phe Leu Thr Thr Leu Leu Pro Thr Ile Trp Ile Cys
20 25 30

Val Lys Ser Ser Met Phe Thr Leu Leu Arg Leu Asn Thr Trp Ile Lys
35 40 45

Asp Phe His Ser Pro Ser Ser Cys Val Val Thr Phe Gln Lys Ala Phe
50 55 60

Thr Asn Gly Arg Gly Lys Ile Asn Lys Arg His Val Thr Cys Pro Ser
65 70 75 80

Phe Val Thr Met Pro Leu Thr Arg Glu Ser Ser Leu Ser Thr Thr Ser
85 90 95

Val Pro Leu Gln Met Thr Val Glu Lys Ser Ala Pro Thr Asn Val Lys
100 105 110

Ala Val



(2) INFORMATION FOR SEQ ID NO:167: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 253 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 



(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 167: 



Met Leu Lys Gln Glu Lys Leu Ala Lys Ile Leu Glu Ile Val Asn Ser
1 5 10 15

Lys Gly Thr Ile Thr Val Lys Gln Ile Met Asp Glu Ile Ala Val Ser
20 25 30

Asp Met Thr Ala Arg Arg Tyr Leu Gln Glu Leu Ala Asp Lys Asp Leu
35 40 45

Leu Ile Arg Val His Gly Gly Ala Glu Lys Leu Arg Thr Asn Ser Leu
50 55 60

Leu Thr Asn Glu Arg Ser Asn Ile Glu Lys Gln Ala Leu Gln Thr Ala
65 70 75 80

Glu Lys Gln Glu Ile Ala His Phe Ala Gly Ser Leu Val Glu Glu Arg
85 90 95

Glu Thr Ile Phe Ile Gly Pro Gly Thr Thr Leu Glu Phe Phe Ala Arg
100 105 110

Glu Leu Pro Ile Asp Asn Ile Arg Val Val Thr Asn Ser Leu Pro Val
115 120 125

Phe Leu Ile Leu Ser Glu Arg Lys Leu Thr Asp Leu Ile Leu Ile Gly
130 135 140

Gly Asn Tyr Arg Asp Ile Thr Gly Ala Phe Val Gly Thr Leu Thr Leu
145 150 155 160

Gln Asn Leu Ser Asn Leu Gln Phe Ser Lys Ala Phe Val Ser Cys Asn
165 170 175

Gly Ile Gln Asn Gly Ala Leu Ala Thr Phe Ser Glu Glu Glu Gly Glu
180 185 190

Ala Gln Arg Ile Ala Leu Asn Asn Ser Asn Lys Lys Tyr Leu Leu Ala
195 200 205

Asp His Ser Lys Phe Asn Lys Phe Asp Phe Tyr Thr Phe Tyr Asn Ile
210 215 220

Ser Asn Leu Asp Thr Ile Val Ser Asp Ser Lys Leu Ser Asp Ser Ile
225 230 235 240

Leu Phe Lys Leu Ser Lys His Ile Lys Val Ile Lys Pro
245 250

(2) INFORMATION FOR SEQ ID NO: 168:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 320 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 168:

Met Glu Thr Tyr Tyr Lys Ala Ile Asn Trp Asn Ala Ile Glu Asp Val
1 5 10 15

Ile Asp Lys Ser Thr Trp Glu Lys Leu Thr Glu Gln Phe Trp Leu Asp
20 25 30

Thr Arg Ile Pro Leu Ser Asn Asp Leu Asp Asp Trp Arg Lys Leu Ser
35 40 45

Asn Lys Glu Lys Asp Leu Val Gly Lys Val Phe Gly Gly Leu Thr Leu
50 55 60

Leu Asp Thr Met Gln Ser Glu Thr Gly Val Gln Ala Leu Arg Ala Asp
65 70 75 80

Ile Arg Thr Pro His Glu Glu Ala Val Phe Asn Asn Ile Gln Phe Met
85 90 95

Glu Ser Val His Ala Lys Ser Tyr Ser Ser Ile Phe Ser Thr Leu Asn
100 105 110

Thr Lys Ala Glu Ile Glu Glu Ile Phe Glu Trp Thr Asn Thr Asn Pro
115 120 125

Tyr Leu Gln Lys Lys Ala Glu Ile Val Asn Glu Ile Tyr Leu Asn Gly
130 135 140

Ser Pro Leu Glu Lys Lys Val Ala Ser Val Phe Leu Glu Thr Phe Leu
145 150 155 160

Phe Tyr Ser Gly Phe Phe Thr Pro Leu Tyr Tyr Leu Gly Asn Asn Lys
165 170 175

Leu Ala Asn Val Ala Glu Ile Ile Lys Leu Ile Ile Arg Asp Glu Ser
180 185 190

Val His Gly Thr Tyr Ile Gly Tyr Lys Phe Gln Leu Gly Phe Asn Glu
195 200 205

Leu Pro Glu Glu Glu Gln Glu Lys Leu Lys Glu Trp Met Tyr Asp Leu
210 215 220

Leu Tyr Thr Leu Tyr Glu Asn Glu Glu Gly Tyr Thr Glu Ser Leu Tyr
225 230 235 240

Asp Gly Val Gly Trp Thr Glu Glu Val Lys Thr Phe Leu Arg Tyr Asn
245 250 255

Ala Asn Lys Ala Leu Met Asn Met Gly Gln Asp Pro Leu Phe Pro Asp
260 265 270

Ser Ala Glu Asp Val Asn Pro Ile Val Met Asn Gly Ile Ser Thr Gly
275 280 285

Thr Ser Asn His Asp Phe Phe Ser Gln Val Gly Asn Gly Tyr Leu Leu
290 295 300

Gly Glu Val Glu Ala Met Gln Asp Asp Asp Tyr Asn Tyr Gly Leu Asp
305 310 315 320



(2) INFORMATION FOR SEQ ID NO: 169: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 240 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 169: 



Ile Glu Glu Gly Val Lys Val Val Thr Thr Gly Ala Gly Asn Pro Ser
1 5 10 15

Lys Tyr Met Glu Arg Phe His Glu Ala Gly Ile Ile Val Ile Pro Val
20 25 30

Val Pro Ser Val Ala Leu Ala Lys Arg Met Glu Lys Ile Gly Ala Asp
35 40 45

Ala Val Ile Ala Glu Gly Met Glu Ala Gly Gly His Ile Gly Lys Leu
50 55 60

Thr Thr Met Thr Leu Val Arg Gln Val Ala Thr Ala Val Ser Ile Pro
65 70 75 80

Val Ile Ala Ala Gly Gly Ile Ala Asp Gly Glu Gly Ala Ala Ala Gly
85 90 95

Phe Met Leu Gly Ala Glu Ala Val Gln Val Gly Thr Arg Phe Val Val
100 105 110

Ala Lys Glu Ser Asn Ala His Pro Asn Tyr Lys Glu Lys Ile Leu Lys
115 120 125

Ala Arg Asp Ile Asp Thr Thr Ile Ser Ala Gln His Phe Gly His Ala
130 135 140

Val Arg Ala Ile Lys Asn Gln Leu Thr Arg Asp Phe Glu Leu Ala Glu
145 150 155 160

Lys Asp Ala Phe Lys Gln Glu Asp Pro Asp Leu Glu Ile Phe Glu Gln
165 170 175

Met Gly Ala Gly Ala Leu Ala Lys Ala Val Val His Gly Asp Val Glu
180 185 190

Gly Gly Ser Val Met Ala Gly Gln Ile Ala Gly Leu Val Ser Lys Glu
195 200 205

Glu Thr Ala Glu Glu Ile Leu Lys Asp Leu Tyr Tyr Gly Ala Ala Lys
210 215 220

Lys Ile Gln Glu Glu Ala Ser Arg Trp Thr Gly Val Val Arg Asn Asp
225 230 235 240



(2) INFORMATION FOR SEQ ID NO: 170: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 243 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 170: 

Met Lys Leu Glu His Lys Asn Ile Phe Ile Thr Gly Ser Ser Arg Gly
1 5 10 15

Ile Gly Leu Ala Ile Ala His Lys Phe Ala Gln Ala Gly Ala Asn Ile
20 25 30

Val Leu Asn Ser Arg Gly Ala Ile Ser Glu Glu Leu Leu Ala Glu Phe
35 40 45

Ser Asn Tyr Gly Ile Lys Val Val Pro Ile Ser Gly Asp Val Ser Asp
50 55 60

Phe Ala Asp Ala Lys Arg Met Ile Asp Gln Ala Ile Ala Glu Leu Gly
65 70 75 80

Ser Val Asp Val Leu Val Asn Asn Ala Gly Ile Thr Gln Asp Thr Leu
85 90 95

Met Leu Lys Met Thr Glu Ala Asp Phe Glu Lys Val Leu Lys Val Asn
100 105 110

Leu Thr Gly Ala Phe Asn Met Thr Gln Ser Val Leu Lys Pro Met Met
115 120 125

Lys Ala Arg Glu Gly Ala Ile Ile Asn Met Ser Ser Val Val Gly Leu
130 135 140

Met Gly Asn Ile Gly Gln Ala Asn Tyr Ala Ala Ser Lys Ala Gly Leu
145 150 155 160

Ile Gly Phe Thr Lys Ser Val Ala Arg Glu Val Ala Ser Arg Asn Ile
165 170 175

Arg Val Asn Val Ile Ala Pro Gly Met Ile Glu Ser Asp Met Thr Ala
180 185 190

Ile Leu Ser Asp Lys Ile Lys Glu Ala Thr Leu Ala Gln Ile Pro Met
195 200 205

Lys Glu Phe Gly Gln Ala Glu Gln Val Ala Asp Leu Thr Val Phe Leu
210 215 220

Ala Gly Gln Asp Tyr Leu Thr Gly Gln Val Ile Ala Ile Asp Gly Gly
225 230 235 240

Leu Ser Met

(2) INFORMATION FOR SEQ ID NO: 171:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 306 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 171: 

Met Thr Lys Thr Ala Phe Leu Phe Ala Gly Gln Gly Ala Gln Tyr Leu
1 5 10 15

Gly Met Gly Arg Asp Phe Tyr Asp Gln Tyr Pro Ile Val Lys Glu Thr
20 25 30

Ile Asp Arg Ala Ser Gln Val Leu Gly Tyr Asp Leu Arg Tyr Leu Ile
35 40 45

Asp Thr Glu Glu Asp Lys Leu Asn Gln Thr Arg Tyr Thr Gln Pro Ala
50 55 60

Ile Leu Ala Thr Ser Val Ala Ile Tyr Arg Leu Leu Gln Glu Lys Gly
65 70 75 80

Tyr Gln Pro Asp Met Val Ala Gly Leu Ser Leu Gly Glu Tyr Ser Ala
85 90 95

Leu Val Ala Ser Gly Ala Leu Asp Phe Glu Asp Ala Val Ala Leu Val
100 105 110

Ala Lys Arg Gly Ala Tyr Met Glu Glu Ala Ala Pro Ala Asp Ser Gly
115 120 125

Lys Met Val Ala Val Leu Asn Thr Pro Val Glu Val Ile Glu Glu Ala
130 135 140

Cys Gln Lys Ala Ser Glu Leu Gly Val Val Thr Pro Ala Asn Tyr Asn
145 150 155 160

Thr Pro Ala Gln Ile Val Ile Ala Gly Glu Val Val Ala Val Asp Arg
165 170 175

Ala Val Glu Leu Leu Gln Glu Ala Gly Ala Lys Arg Leu Ile Pro Leu
180 185 190

Lys Val Ser Gly Pro Phe His Thr Ser Leu Leu Glu Pro Ala Ser Gln
195 200 205

Lys Leu Ala Glu Thr Leu Ala Gln Val Ser Phe Ser Asp Phe Thr Cys
210 215 220

Pro Leu Val Gly Asn Thr Glu Ala Ala Val Met Gln Lys Glu Asp Ile
225 230 235 240

Ala Gln Leu Leu Thr Arg Gln Val Lys Glu Pro Val Arg Phe Tyr Glu
245 250 255

Ser Ile Gly Val Met Gln Glu Ala Gly Ile Ser Asn Phe Ile Glu Ile
260 265 270

Gly Pro Gly Lys Val Leu Ser Gly Phe Val Lys Lys Ile Asp Gln Thr
275 280 285

Ala His Leu Ala His Val Glu Asp Gln Ala Ser Leu Val Ala Leu Leu
290 295 300

Glu Lys
305

(2) INFORMATION FOR SEQ ID NO: 172: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 318 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 172: 

Met Lys Leu Asn Arg Val Val Val Thr Gly Tyr Gly Val Thr Ser Pro
1 5 10 15

Ile Gly Asn Thr Pro Glu Glu Phe Trp Asn Ser Leu Ala Thr Gly Lys
20 25 30

Ile Gly Ile Gly Gly Ile Thr Lys Phe Asp His Ser Asp Phe Asp Val
35 40 45

His Asn Ala Ala Glu Ile Gln Asp Phe Pro Phe Asp Lys Tyr Phe Val
50 55 60

Lys Lys Asp Thr Asn Arg Phe Asp Asn Tyr Ser Leu Tyr Ala Leu Tyr
65 70 75 80

Ala Ala Gln Glu Ala Val Asn His Ala Asn Leu Asp Val Glu Ala Leu
85 90 95

Asn Arg Asp Arg Phe Gly Val Ile Val Ala Ser Gly Ile Gly Gly Ile
100 105 110

Lys Glu Ile Glu Asp Gln Val Leu Arg Leu His Glu Lys Gly Pro Lys
115 120 125

Arg Val Lys Pro Met Thr Leu Pro Lys Ala Leu Pro Asn Met Ala Ser
130 135 140

Gly Asn Val Ala Met Arg Phe Gly Ala Asn Gly Val Cys Lys Ser Ile
145 150 155 160

Asn Thr Ala Cys Ser Ser Ser Asn Asp Ala Ile Gly Asp Ala Phe Arg
165 170 175

Ser Ile Lys Phe Gly Phe Gln Asp Val Met Leu Val Gly Gly Thr Glu
180 185 190

Ala Ser Ile Thr Pro Phe Ala Ile Ala Gly Phe Gln Ala Leu Thr Ala
195 200 205

Leu Ser Thr Thr Glu Asp Pro Thr Arg Ala Ser Ile Pro Phe Asp Lys
210 215 220

Asp Arg Asn Gly Phe Val Met Gly Glu Gly Ser Gly Met Leu Val Leu
225 230 235 240

Glu Ser Leu Glu His Ala Glu Lys Arg Gly Ala Thr Ile Leu Ala Glu
245 250 255

Val Val Gly Tyr Gly Asn Thr Cys Asp Ala Tyr His Met Thr Ser Pro
260 265 270

His Pro Glu Gly Gln Gly Ala Ile Lys Ala Ile Lys Leu Ala Leu Glu
275 280 285

Glu Ala Glu Ile Ser Pro Glu Gln Val Ala Met Leu Met Leu Thr Glu
290 295 300

Arg Gln Leu Leu Pro Met Lys Lys Glu Lys Val Val Leu Ser
305 310 315

(2) INFORMATION FOR SEQ ID NO: 173: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 396 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 173: 

Met Gln Ala Val Glu His Phe Ile Lys Gln Phe Val Pro Glu His Tyr
1 5 10 15

Asp Leu Phe Leu Asp Leu Ser Arg Glu Thr Lys Thr Phe Ser Gly Lys
20 25 30

Val Thr Ile Thr Gly Gln Ala Gln Ser Asp Arg Ile Ser Leu His Gln
35 40 45

Lys Asp Leu Glu Ile Thr Ser Val Glu Val Ala Gly Gln Ala Arg Pro
50 55 60

Phe Thr Val Asp His Asp Asn Glu Ala Leu His Ile Glu Leu Ala Glu
65 70 75 80

Ala Gly Gln Val Glu Leu Val Leu Ala Phe Ser Gly Lys Ile Thr Asp
85 90 95

Asn Met Thr Gly Ile Tyr Pro Ser Tyr Tyr Thr Val Asp Gly Val Lys
100 105 110

Lys Glu Val Leu Ser Thr Gln Phe Glu Ser His Phe Ala Arg Glu Ala
115 120 125

Phe Pro Cys Val Asp Glu Pro Glu Ala Lys Ala Thr Phe Asp Leu Ser
130 135 140

Leu Arg Phe Asp Gln Ala Glu Gly Glu Leu Ala Leu Ser Asn Met Pro
145 150 155 160

Glu Ile Asp Val Glu Asn Arg Lys Glu Thr Gly Ile Trp Lys Phe Glu
165 170 175

Thr Thr Pro Arg Met Ser Ser Tyr Leu Leu Ala Phe Val Ala Gly Asp
180 185 190

Leu Gln Gly Val Thr Ala Lys Thr Lys Asn Gly Thr Leu Val Gly Val
195 200 205

Tyr Ser Thr Lys Ala His Pro Leu Ser Asn Leu Asp Phe Ser Leu Asp
210 215 220

Ile Ala Val Arg Ser Ile Glu Phe Tyr Glu Asp Tyr Tyr Gly Val Lys
225 230 235 240

Tyr Pro Ile Pro Gln Ser Leu His Ile Ala Leu Pro Asp Phe Ser Ala
245 250 255

Gly Ala Met Glu Asn Trp Gly Leu Val Thr Tyr Arg Glu Val Tyr Leu
260 265 270

Val Val Asp Glu Asn Ser Thr Phe Ala Ser Arg Gln Gln Val Ala Leu
275 280 285

Val Val Ala His Glu Leu Ala His Gln Trp Phe Gly Asn Leu Val Thr
290 295 300

Met Lys Trp Trp Asp Asp Leu Trp Leu Asn Glu Ser Phe Ala Asn Met
305 310 315 320

Met Glu Tyr Val Cys Val Asp Thr Ile Glu Pro Ser Trp Asn Ile Phe
325 330 335

Glu Asp Phe Gln Thr Gly Gly Val Pro Leu Ala Leu Glu Arg Asp Ala
340 345 350

Thr Asp Gly Val Gln Ser Val His Val Glu Val Lys His Pro Asp Glu
355 360 365

Ile Asn Thr Leu Phe Asp Gly Ala Ile Val Tyr Ala Arg Lys Arg Leu
370 375 380

Met His Met Leu Arg Val Ala Arg Asp Ala Asp Leu
385 390 395

(2) INFORMATION FOR SEQ ID NO: 174: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 127 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 174:

Met Asp Phe Leu Leu Phe Tyr Asp Ser Lys Lys Lys Gly Asp Thr Met
1 5 10 15

Thr Tyr Leu Glu Lys Trp Phe Asp Phe Asn Arg Arg Gln Lys Glu Ile
20 25 30

Glu Ser Leu Leu Glu Glu Thr Ile Ala Gln Gln Ser Glu Gln Ser Leu
35 40 45

Thr Leu Lys Glu Phe Tyr Leu Leu Tyr Tyr Leu Asp Leu Ala Glu Glu
50 55 60

Lys Ser Leu Arg Gln Ile Asp Leu Pro Asp Lys Leu His Leu Ser Pro
65 70 75 80

Ser Ala Val Ser Arg Met Val Ala Arg Leu Glu Ala Lys Asn Cys Gly
85 90 95

Leu Leu Ser Arg Met Cys Cys His Gln Asp Arg Arg Ser Ser Phe Ile
100 105 110

Cys Leu Thr Asn Asp Gly Gln Lys Thr Leu Ala Ser Leu Gln Lys
115 120 125

(2) INFORMATION FOR SEQ ID NO: 175:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 373 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 175:

Met Leu Tyr Asp Tyr Gly Asn Ser Val Trp Leu Ala Ser Met Gly Thr
1 5 10 15

Ile Gly Gln Thr Val Leu Gly Met Tyr Gln Ile Ser Glu Leu Val Thr
20 25 30

Ser Ile Leu Val Asn Pro Phe Gly Gly Val Ile Ser Asp Arg Phe Ser
35 40 45

Arg Arg Lys Ile Leu Met Thr Ala Asp Leu Val Cys Gly Ile Leu Cys
50 55 60

Leu Ala Ile Ser Phe Ile Arg Asn Asp Ser Trp Met Ile Gly Ala Leu
65 70 75 80

Ile Val Ala Asn Ile Val Gln Ala Ile Ala Phe Ala Phe Ser Arg Thr
85 90 95

Ala Asn Lys Ala Ile Ile Thr Glu Val Val Glu Lys Asn Glu Ile Val
100 105 110

Ile Tyr Asn Ser Arg Leu Glu Leu Val Leu Gln Val Val Gly Val Ser
115 120 125

Ser Pro Val Leu Ser Phe Leu Val Leu Gln Phe Ala Ser Leu His Met
130 135 140

Thr Leu Leu Leu Asp Ser Leu Thr Phe Phe Ile Ala Phe Val Leu Val
145 150 155 160

Ala Phe Leu Pro Lys Glu Glu Ala Lys Val Gln Glu Lys Lys Ala Phe
165 170 175

Thr Gly Arg Asp Ile Phe Val Asp Ile Lys Asp Gly Leu His Tyr Ile
180 185 190

Trp His Gln Gln Glu Ile Phe Phe Leu Leu Leu Val Ala Ser Ser Val
195 200 205

Asn Phe Phe Phe Ala Ala Phe Glu Phe Leu Leu Pro Phe Ser Asn Gln
210 215 220

Leu Tyr Gly Ser Glu Gly Ala Tyr Ala Ser Ile Leu Thr Met Gly Ala
225 230 235 240

Ile Gly Ser Ile Ile Gly Ala Leu Leu Ala Ser Lys Ile Lys Ala Asn
245 250 255

Ile Tyr Asn Leu Leu Ile Leu Leu Ala Leu Thr Gly Val Gly Val Phe
260 265 270

Met Met Gly Leu Pro Leu Pro Thr Phe Leu Ser Phe Ser Gly Asn Leu
275 280 285

Val Cys Glu Leu Phe Met Thr Ile Phe Asn Ile His Phe Phe Thr Gln
290 295 300

Val Gln Thr Lys Val Glu Ser Glu Phe Leu Gly Arg Val Leu Ser Thr
305 310 315 320

Ile Phe Thr Leu Ala Ile Leu Phe Met Pro Ile Ala Lys Gly Phe Met
325 330 335

Thr Val Leu Pro Ser Val His Leu Ser Ser Phe Leu Ile Ile Gly Ser
340 345 350

Gly Val Ile Ile Leu Ser Cys Ile Ser Phe Ile Tyr Val Arg Thr His
355 360 365

Phe Glu Lys Leu Ile
370

(2) INFORMATION FOR SEQ ID NO: 176: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 427 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 176: 

Met Ser Val Ser Phe Glu Asn Lys Glu Thr Asn Arg Gly Val Leu Thr
1 5 10 15

Phe Thr Ile Ser Gln Asp Gln Ile Lys Pro Glu Leu Asp Arg Val Phe
20 25 30

Lys Ser Val Lys Lys Ser Leu Asn Val Pro Gly Phe Arg Lys Gly His
35 40 45

Leu Pro Arg Pro Ile Phe Asp Gln Lys Phe Gly Glu Glu Ala Leu Tyr
50 55 60

Gln Asp Ala Met Asn Ala Leu Leu Pro Asn Ala Tyr Glu Ala Ala Val
65 70 75 80

Lys Glu Ala Gly Leu Glu Val Val Ala Gln Pro Lys Ile Asp Val Thr
85 90 95

Ser Met Glu Lys Gly Gln Asp Trp Val Ile Thr Ala Glu Val Val Thr
100 105 110

Lys Pro Glu Val Lys Leu Gly Asp Tyr Lys Asn Leu Glu Val Ser Val
115 120 125

Asp Val Glu Lys Glu Val Thr Asp Ala Asp Val Glu Glu Arg Ile Glu
130 135 140

Arg Glu Arg Asn Asn Leu Ala Glu Leu Val Ile Lys Glu Ala Ala Ala
145 150 155 160

Glu Asn Gly Asp Thr Val Val Ile Asp Phe Val Gly Ser Ile Asp Gly
165 170 175

Val Glu Phe Asp Gly Gly Lys Gly Glu Asn Phe Ser Leu Gly Leu Gly
180 185 190

Ser Gly Gln Phe Ile Pro Gly Phe Glu Asp Gln Leu Val Gly His Ser
195 200 205

Ala Gly Glu Thr Val Asp Val Ile Val Thr Phe Pro Glu Asp Tyr Gln
210 215 220

Ala Glu Asp Leu Ala Gly Lys Glu Ala Lys Phe Val Thr Thr Ile His
225 230 235 240

Glu Val Lys Ala Lys Glu Val Pro Ala Leu Asp Asp Glu Leu Ala Lys
245 250 255

Asp Ile Asp Glu Glu Val Glu Thr Leu Ala Asp Leu Lys Glu Lys Tyr
260 265 270

Arg Lys Glu Leu Ala Ala Ala Lys Glu Glu Thr Tyr Lys Asp Ala Val
275 280 285

Glu Gly Ala Ala Ile Asp Thr Ala Val Glu Asn Ala Glu Ile Val Glu
290 295 300

Leu Pro Glu Glu Met Ile His Glu Glu Val His Arg Ser Val Asn Glu
305 310 315 320

Phe Leu Gly Asn Leu Gln Arg Gln Gly Ile Asn Pro Asp Met Tyr Phe
325 330 335

Gln Ile Thr Gly Thr Thr Gln Glu Asp Leu His Asn Gln Tyr Gln Ala
340 345 350

Glu Ala Glu Ser Arg Thr Lys Thr Asn Leu Val Ile Glu Ala Val Ala
355 360 365

Lys Ala Glu Gly Phe Asp Ala Ser Glu Glu Glu Ile Gln Lys Glu Val
370 375 380

Glu Gln Leu Ala Ala Asp Tyr Asn Met Glu Val Ala Gln Val Gln Asn
385 390 395 400

Leu Leu Ser Ala Asp Met Leu Lys His Asp Ile Thr Ile Lys Lys Ala
405 410 415

Val Glu Leu Ile Thr Ser Thr Ala Thr Val Lys
420 425

(2) INFORMATION FOR SEQ ID NO: 177:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 207 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 177:



Gly Gly Asp Lys Asp Phe Leu Thr Ser Ile Cys Leu Thr Asn Asp Pro
1 5 10 15

Phe Leu Gly Phe Arg Ala Leu Arg Ile Ser Ile Ser Glu Thr Gly Asp
20 25 30

Ala Met Phe Arg Thr Gln Ile Arg Ala Leu Leu Arg Ala Ser Val His
35 40 45

Gly Gln Leu Arg Ile Met Phe Pro Met Val Ala Leu Leu Lys Glu Phe
50 55 60

Arg Ala Ala Lys Ala Val Phe Asp Glu Glu Lys Ala Asn Leu Leu Ala
65 70 75 80

Glu Gly Val Ala Val Ala Asp Asn Ile Gln Val Gly Ile Met Ile Glu
85 90 95

Ile Pro Ala Ala Ala Met Leu Ala Asp Gln Phe Ala Lys Glu Val Asp
100 105 110

Phe Phe Ser Ile Gly Thr Asn Asp Leu Ile Gln Tyr Thr Met Ala Ala
115 120 125

Asp Arg Met Asn Glu Gln Val Ser Tyr Leu Tyr Gln Pro Tyr Asn Pro
130 135 140

Ser Ile Leu Arg Leu Ile Asn Asn Val Ile Lys Ala Ala His Ala Glu
145 150 155 160

Gly Lys Trp Ala Gly Met Cys Gly Glu Met Ala Gly Asp Gln Gln Ala
165 170 175

Val Pro Leu Leu Val Gly Met Gly Leu Asp Glu Phe Ser Met Ser Ala
180 185 190

Thr Cys Thr Ser Tyr Thr Gln Leu Asp Glu Glu Thr Arg His Ser
195 200 205

(2) INFORMATION FOR SEQ ID NO: 178: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 283 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 178:

Met Gln Met Ala Tyr Arg Cys Asn Leu Arg Asn Asn Gly Lys Arg Arg
1 5 10 15

Ile Gly Ile Arg Glu Met Thr Glu Met Leu Lys Gly Ile Ala Ala Ser
20 25 30

Asp Gly Val Ala Val Ala Lys Ala Tyr Leu Leu Val Gln Pro Asp Leu
35 40 45

Ser Phe Glu Thr Ile Thr Val Glu Asp Thr Asn Ala Glu Glu Ala Arg
50 55 60

Leu Asp Ala Ala Leu Gln Ala Ser Gln Asp Glu Leu Ser Val Ile Arg
65 70 75 80

Glu Lys Ala Val Gly Thr Leu Gly Glu Glu Ala Ala Gln Val Phe Asp
85 90 95

Ala His Leu Met Val Leu Ala Asp Pro Glu Met Ile Ser Gln Ile Lys
100 105 110

Glu Thr Ile Arg Ala Lys Lys Val Asn Ala Glu Ala Gly Leu Lys Glu
115 120 125

Val Thr Asp Met Phe Ile Thr Ile Phe Glu Gly Met Glu Asp Asn Pro
130 135 140

Tyr Met Gln Glu Arg Ala Arg Asp Ile Arg Asp Val Thr Lys Arg Val
145 150 155 160

Leu Ala Asn Leu Leu Gly Lys Lys Leu Pro Asn Pro Ala Ser Ile Asn
165 170 175

Glu Glu Val Ile Val Ile Ala His Asp Leu Thr Pro Ser Asp Thr Ala
180 185 190

Gln Leu Asp Lys Asn Phe Val Lys Ala Phe Val Thr Asn Ile Gly Gly
195 200 205

Arg Thr Ser His Ser Ala Ile Met Ala Arg Thr Leu Glu Ile Ala Ala
210 215 220

Val Leu Gly Thr Asn Asn Ile Thr Glu Ile Val Lys Asp Gly Asp Ile
225 230 235 240

Leu Ala Val Asn Gly Ile Thr Gly Glu Val Ile Ile Asn Pro Thr Asp
245 250 255

Glu Gln Ala Ala Glu Phe Lys Ala Ala Gly Glu Ala Tyr Ala Thr Lys
260 265 270

Ala Glu Trp Ala Leu Leu Lys Asp Ala Gln Gln
275 280

(2) INFORMATION FOR SEQ ID NO: 179: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 168 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 179:



Met Ile Gly Arg Leu Ala Pro Tyr Asp Lys Gly Gln Ile Ile Tyr Asp
1 5 10 15

Gly Thr Ser Leu Lys Asp Ile Lys Pro Ser Val Phe Phe Arg Asp Tyr
20 25 30

Leu Gly Tyr Leu Phe Gln Asp Phe Gly Leu Ile Glu Ser Gln Thr Val
35 40 45

Lys Glu Asn Leu Asn Leu Gly Leu Val Gly Lys Lys Leu Lys Glu Lys
50 55 60

Glu Lys Ile Ser Leu Met Lys Gln Ala Leu Asn Arg Val Asn Leu Ser
65 70 75 80

Tyr Leu Asp Leu Lys Gln Pro Ile Phe Glu Leu Ser Gly Gly Glu Ala
85 90 95

Gln Arg Val Ala Leu Ala Lys Ile Ile Leu Lys Asp Pro Pro Leu Ile
100 105 110

Leu Ala Asp Glu Pro Thr Ala Ser Leu Asp Pro Lys Asn Ser Glu Glu
115 120 125

Leu Leu Ser Ile Leu Glu Ser Leu Lys Asn Pro Asn Arg Thr Ile Ile
130 135 140

Ile Ala Thr His Asn Pro Leu Ile Trp Glu Gln Val Asp Gln Val Ile
145 150 155 160

Arg Val Thr Asp Leu Ser His Arg
165

(2) INFORMATION FOR SEQ ID NO: 180: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 210 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 180:

Met Lys Ala His Val Ser Tyr Leu Ser Met Gly Glu Lys Arg Phe Val
1 5 10 15

Tyr Asn Asn Gly Glu Asn Pro Val Ser Thr Gln Tyr Leu Thr Asp Pro
20 25 30

Ile Leu Val Val Phe Thr Pro Thr Ser Thr Gly Asp Ser Phe Ile Ser
35 40 45

Leu Ser Ser Trp Ser Ile Asn Ala Gly Lys Gln Leu Phe Ile Lys Gly
50 55 60

Tyr Glu Ser Gly Leu Glu Leu Leu Lys Lys Ala Gly Ile Tyr Glu Gln
65 70 75 80

Val Ser Tyr Leu Lys Glu Gly Arg Ser Val Tyr Leu Thr Arg Tyr Asn
85 90 95

Glu Val Gln Thr Glu Thr Ala Thr Leu Ile Leu Gly Ala Ile Val Gly
100 105 110

Ile Ala Ser Ser Leu Leu Leu Phe Tyr Ser Val Asn Leu Leu Tyr Phe
115 120 125

Glu Gln Phe Arg Arg Asp Ile Leu Ile Lys Arg Ile Ser Gly Leu Arg
130 135 140

Phe Phe Glu Thr His Ala Gln Tyr Met Val Ser Gln Phe Ala Ser Phe
145 150 155 160

Val Phe Gly Ala Ser Leu Phe Ile Leu Ser Ser Arg Asp Leu Val Ile
165 170 175

Gly Leu Leu Thr Leu Leu Val Phe Leu Ala Ser Ala Val Leu Thr Leu
180 185 190

Tyr Arg Gln Ala Gln Lys Glu Ser Arg Val Ser Met Thr Ile Met Lys
195 200 205

Gly Lys
210

(2) INFORMATION FOR SEQ ID NO: 181:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 227 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 181:

Glu Phe Gln Glu Ala Ser Gln Glu Ser Arg Glu Arg Ser Asp Pro Leu
1 5 10 15

Asn Ser Tyr Leu Leu Leu Ser Gly Ser Leu Thr Lys Glu Lys Leu Ala
20 25 30

Asp Lys Leu Gly Asp Leu Gly Tyr Lys Ala Ser Ala Asp Arg Lys Ile
35 40 45

Pro Pro Tyr Phe Leu Ala Phe Arg Ile Leu Leu Asn Pro Leu Ile Leu
50 55 60

Ile Ser Leu Ala Ile Phe Gly Leu Ser Phe Phe Ala Leu Val Ile Ile
65 70 75 80

Thr Arg Ile Lys Glu Met Arg Ala Ala Gly Ile Lys Leu Phe Ser Gly
85 90 95

Gln Thr Leu Leu Ser Ile Met Gly His Ser Leu Ser Thr Asp Ile Lys
100 105 110

Trp Leu Leu Leu Ser Ala Leu Leu Ser Phe Leu Gly Gly Gly Val Val
115 120 125

Leu Phe Ser Gln Gly Leu Phe Tyr Pro Ile Leu Leu Ala Thr Tyr Gly
130 135 140

Phe Gly Ile Ser Phe Tyr Leu Leu Phe Leu Leu Ala Ile Ser Ile Leu
145 150 155 160

Leu Met Leu Leu Tyr Leu Met Ser Leu Asn Lys Ala Leu Val Pro Val
165 170 175

Ile Arg Gly Arg Phe Pro Leu Leu Met Thr Leu Phe Gln Pro Val Phe
180 185 190

Ser Val Gly Tyr Ala Lys Thr Gly Leu Thr Ser Tyr Gln Arg Leu Lys
195 200 205

Glu Leu Glu Ile Ser Gln Trp Gln Asp Arg Val Asp Tyr Tyr His Asp
210 215 220

Phe Phe Thr
225

(2) INFORMATION FOR SEQ ID NO: 182:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 396 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 182:

Met Ser Lys Asp Lys Lys Asn Glu Asp Lys Glu Thr Leu Glu Glu Leu
1 5 10 15

Lys Glu Leu Ser Glu Trp Gln Lys Arg Asn Gln Glu Tyr Leu Lys Lys
20 25 30

Lys Ala Glu Glu Glu Val Ala Leu Ala Glu Glu Lys Glu Lys Glu Arg
35 40 45

Gln Ala Arg Met Gly Glu Glu Ser Glu Lys Ser Glu Asp Lys Gln Asp
50 55 60

Gln Glu Ser Glu Thr Asp Gln Glu Asp Ser Glu Ser Ala Lys Glu Glu
65 70 75 80

Ser Glu Glu Lys Val Ala Ser Ser Glu Ala Asp Lys Glu Lys Glu Glu
85 90 95

Pro Glu Ser Lys Glu Lys Glu Glu Gln Asp Lys Lys Leu Ala Lys Lys
100 105 110

Ala Thr Lys Glu Lys Pro Ala Lys Ala Lys Ile Pro Gly Ile His Ile
115 120 125

Leu Arg Ala Phe Thr Ile Leu Phe Pro Ser Leu Leu Leu Leu Ile Val
130 135 140

Ser Ala Tyr Leu Leu Ser Pro Tyr Ala Thr Met Lys Asp Ile Arg Val
145 150 155 160

Glu Gly Thr Val Gln Thr Thr Ala Asp Asp Ile Arg Gln Ala Ser Gly
165 170 175

Ile Gln Asp Ser Asp Tyr Thr Ile Asn Leu Leu Leu Asp Lys Ala Lys
180 185 190

Tyr Glu Lys Gln Ile Lys Ser Asn Tyr Trp Val Glu Ser Ala Gln Leu
195 200 205

Val Tyr Gln Phe Pro Thr Lys Phe Thr Ile Lys Val Lys Glu Tyr Asp
210 215 220

Ile Val Ala Tyr Tyr Ile Ser Gly Glu Asn His Tyr Pro Ile Leu Ser
225 230 235 240

Ser Gly Gln Leu Glu Thr Ser Ser Val Ser Leu Asn Ser Leu Pro Glu
245 250 255

Thr Tyr Leu Ser Val Leu Phe Asn Asp Ser Glu Gln Ile Lys Val Phe
260 265 270

Val Ser Glu Leu Ala Gln Ile Ser Pro Glu Leu Lys Ala Ala Ile Gln
275 280 285

Lys Val Glu Leu Ala Pro Ser Lys Val Thr Ser Asp Leu Ile Arg Leu
290 295 300



Thr Met Asn Asp Ser Asp Glu Val Leu Val Pro Leu Ser Glu Met Ser
305 310 315 320

Lys Lys Leu Pro Tyr Tyr Ser Lys Ile Lys Pro Gln Leu Ser Glu Pro
325 330 335

Ser Val Val Asp Met Glu Ala Gly Ile Tyr Ser Tyr Thr Val Ala Asp
340 345 350

Lys Leu Ile Met Glu Ala Glu Glu Lys Ala Lys Gln Glu Ala Lys Glu
355 360 365

Ala Glu Lys Lys Gln Glu Glu Glu Gln Lys Lys Gln Glu Glu Glu Ser
370 375 380






Asn Arg Asn Gln Thr Asn Gln Arg Ser Ser Arg Arg
385 390 395 

(2) INFORMATION FOR SEQ ID NO: 183:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 165 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 183:

Met Glu Arg Val Val Asp Ile Leu Lys Ala Glu Phe Asp Arg Ser Phe
1 5 10 15

Lys Leu Ile Asn Ser Lys Thr Tyr Pro Val Ser Gly Gly Glu Leu Asn
20 25 30

Pro Ala Asn Val Asp Ser Glu Ile Glu Ala Phe Ala Gln Leu Gly Val
35 40 45

Ser Arg Gly Leu Asp Ser Lys Glu Ala His Tyr Leu Ala Asn Leu Tyr
50 55 60

Gly Ser Asn Ala Pro Lys Val Phe Ala Leu Ala His Ser Leu Glu Gln
65 70 75 80

Ala Pro Gly Leu Ser Leu Ala Asp Thr Leu Ser Leu His Tyr Ala Met
85 90 95

Arg Asn Glu Leu Ala Leu Ser Pro Val Asp Phe Leu Leu Arg Arg Thr
100 105 110

Asn His Met Leu Phe Met Arg Asp Ser Leu Asp Ser Ile Val Glu Pro
115 120 125



Val Leu Asp Glu Met Gly Arg Phe Tyr Asp Trp Thr Glu Glu Glu Lys 
130 135 140 

Ala Thr Tyr Arg Ala Asp Val Glu Ala Ala Leu Ala Asn Asn Asp Leu 
145 150 155 160 






Ala Glu Leu Lys Asn 
165 

(2) INFORMATION FOR SEQ ID NO: 184:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 233 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 184:

Met Asn Glu Leu Phe Gly Glu Phe Leu Gly Thr Leu Ile Leu Ile Leu
1 5 10 15

Leu Gly Asn Gly Val Val Ala Gly Val Val Leu Pro Lys Thr Lys Ser
20 25 30

Asn Ser Ser Gly Trp Ile Val Ile Thr Met Gly Trp Gly Ile Ala Val
35 40 45

Ala Val Ala Val Phe Val Ser Gly Lys Leu Ser Pro Ala His Leu Asn
50 55 60

Pro Ala Val Thr Ile Gly Val Ala Leu Lys Gly Gly Leu Pro Trp Ala
65 70 75 80

Ser Val Leu Pro Tyr Ile Leu Ala Gln Phe Ala Gly Ala Met Leu Gly
85 90 95

Gln Ile Leu Val Trp Leu Gln Phe Lys Pro His Tyr Glu Ala Glu Glu
100 105 110

Asn Ala Gly Asn Ile Leu Ala Thr Phe Ser Thr Gly Pro Ala Ile Lys
115 120 125

Asp Thr Val Ser Asn Leu Ile Ser Glu Ile Leu Gly Thr Phe Val Leu
130 135 140

Val Leu Thr Ile Phe Ala Leu Gly Leu Tyr Asp Phe Gln Ala Gly Ile
145 150 155 160

Gly Thr Phe Ala Val Gly Thr Leu Ile Val Gly Ile Gly Leu Ser Leu
165 170 175

Gly Gly Thr Thr Gly Tyr Ala Leu Asn Pro Ala Arg Asp Leu Gly Pro
180 185 190

Arg Ile Met His Ser Ile Leu Pro Ile Pro Asn Lys Gly Asp Gly Asp
195 200 205

Trp Ser Tyr Ala Trp Ile Pro Val Val Gly Pro Val Ile Gly Ala Ala
210 215 220

Leu Ala Val Leu Val Leu Ser Leu Phe 
225 230 

(2) INFORMATION FOR SEQ ID NO: 185:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 135 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 185:

Thr Thr Asp Asn Val Ile Asp Leu Phe Glu His Ile Phe Lys Met Phe
1 5 10 15

Asn Glu Asn Ile Val Met Ala Gly Lys Val Asn Leu Leu Asn Phe Ala
20 25 30

Asn Leu Ala Ala Tyr Gln Phe Phe Asp Gln Pro Gln Lys Val Ala Leu
35 40 45

Glu Ile Arg Glu Gly Leu Arg Glu Asp Gln Met Gln Asn Val Arg Val
50 55 60

Ala Asp Gly Gln Glu Ser Cys Leu Ala Asp Leu Ala Val Ile Ser Ser
65 70 75 80

Lys Phe Leu Ile Pro Tyr Arg Gly Val Gly Ile Leu Ala Ile Ile Gly
85 90 95

Pro Val Asn Leu Asp Tyr Gln Gln Leu Ile Asn Gln Ile Asn Val Val
100 105 110

Asn Arg Val Leu Thr Met Lys Leu Thr Asp Phe Tyr Arg Tyr Leu Ser
115 120 125

Ser Asn His Tyr Glu Val His 
130 135 

(2) INFORMATION FOR SEQ ID NO: 186:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 153 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 186:

Met Ile Ala Lys Glu Phe Glu Thr Phe Leu Leu Gly Gln Glu Glu Thr
1 5 10 15

Phe Leu Thr Pro Ala Lys Asn Leu Ala Val Leu Ile Asp Thr His Asn
20 25 30

Ala Asp His Ala Thr Leu Leu Leu Ser Gln Met Thr Tyr Thr Arg Val
35 40 45

Pro Val Val Thr Asp Glu Lys Gln Phe Val Gly Thr Ile Gly Leu Arg
50 55 60

Asp Ile Met Ala Tyr Gln Met Glu His Asp Leu Ser Gln Glu Ile Met
65 70 75 80

Ala Asp Thr Asp Ile Val His Met Thr Lys Thr Asp Val Ala Val Val
85 90 95

Ser Pro Asp Phe Thr Ile Thr Glu Val Leu His Lys Leu Val Asp Glu
100 105 110

Ser Phe Leu Pro Val Val Asp Ala Glu Gly Ile Phe Gln Gly Ile Ile
115 120 125

Thr Arg Lys Ser Ile Leu Lys Ala Val Asn Ala Leu Leu His Asp Phe
130 135 140

Ser Lys Glu Tyr Glu Ile Arg Cys Gln
145 150 

(2) INFORMATION FOR SEQ ID NO: 187: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 173 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 187: 

Met Ala Lys Gln Thr Ile Ile Val Met Ser Asp Ser His Gly Asp Ser
1 5 10 15

Leu Ile Val Glu Glu Val Arg Asp Arg Tyr Val Gly Lys Val Asp Ala
20 25 30

Val Phe His Asn Gly Asp Ser Glu Leu Arg Pro Asp Ser Pro Leu Trp
35 40 45

Glu Gly Ile Arg Val Val Lys Gly Asn Met Asp Phe Tyr Ala Gly Tyr
50 55 60

Pro Glu Arg Leu Val Thr Glu Leu Gly Ser Thr Lys Ile Ile Gln Thr
65 70 75 80

His Gly His Leu Phe Asp Ile Asn Phe Asn Phe Gln Lys Leu Asp Tyr
85 90 95

Trp Ala Gln Glu Glu Glu Ala Ala Ile Cys Leu Tyr Gly His Leu His
100 105 110






Val Pro Ser Ala Trp Met Glu Gly Lys Ile Leu Phe Leu Asn Pro Gly
115 120 125

Ser Ile Ser Gln Pro Arg Gly Thr Ile Arg Glu Cys Leu Tyr Ala Arg
130 135 140

Val Glu Ile Asp Asp Ser Tyr Phe Lys Val Asp Phe Leu Thr Arg Asp
145 150 155 160

His Glu Val Tyr Pro Gly Leu Ser Lys Glu Phe Ser Arg 
165 170 

(2) INFORMATION FOR SEQ ID NO: 188:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 189 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 188:

Met Ser Thr Leu Ala Lys Ile Glu Ala Leu Leu Phe Val Ala Gly Glu
1 5 10 15

Asp Gly Ile Arg Val Arg Gln Leu Ala Glu Leu Leu Ser Leu Pro Pro
20 25 30

Thr Gly Ile Gln Gln Ser Leu Gly Lys Leu Ala Gln Lys Tyr Glu Lys
35 40 45

Asp Pro Asp Ser Ser Leu Ala Leu Ile Glu Thr Ser Gly Ala Tyr Arg
50 55 60

Leu Val Thr Lys Pro Gln Phe Ala Glu Ile Leu Lys Glu Tyr Ser Lys
65 70 75 80

Ala Pro Ile Asn Gln Ser Leu Ser Arg Ala Ala Leu Glu Thr Leu Ser
85 90 95

Ile Ile Ala Tyr Lys Gln Pro Ile Thr Arg Ile Glu Ile Asp Ala Ile
100 105 110



Arg Gly Val Asn Ser Ser Gly Ala Leu Ala Lys Leu Gln Ala Phe Asp
115 120 125

Leu Ile Lys Glu Asp Gly Lys Lys Glu Val Leu Gly Arg Pro Asn Leu
130 135 140

Tyr Val Thr Thr Asp Tyr Phe Leu Asp Tyr Met Gly Ile Asn His Leu
145 150 155 160

Glu Glu Leu Pro Val Ile Asp Glu Leu Glu Ile Gln Ala Gln Glu Ser
165 170 175

Gln Leu Phe Gly Glu Arg Ile Glu Glu Asp Glu Asn Gln
180 185

(2) INFORMATION FOR SEQ ID NO: 189:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 214 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 189:



Met Arg Asp Arg Ile Ser Ala Phe Leu Glu Glu Lys Gln Gly Leu Ser
1 5 10 15

Val Asn Ser Lys Gln Ser Tyr Lys Tyr Asp Leu Glu Gln Phe Leu Asp
20 25 30

Met Val Gly Glu Arg Ile Ser Glu Thr Ser Leu Lys Ile Tyr Gln Ala
35 40 45

Gln Leu Ala Asn Leu Lys Ile Ser Ala Gln Lys Arg Lys Ile Ser Ala
50 55 60

Cys Asn Gln Phe Leu Tyr Phe Leu Tyr Gln Lys Gly Glu Val Asp Ser
65 70 75 80

Phe Tyr Arg Leu Glu Leu Ala Lys Gln Ala Glu Lys Lys Thr Glu Lys
85 90 95

Pro Glu Ile Leu Tyr Leu Asp Ser Phe Trp Gln Glu Ser Asp His Pro
100 105 110

Glu Gly Arg Leu Leu Ala Leu Leu Ile Leu Glu Met Gly Leu Leu Pro
115 120 125



Ser Glu Ile Leu Ala Ile Lys Val Ala Asp Ile Asn Leu Asp Phe Gln
130 135 140

Val Leu Arg Ile Ser Lys Ala Ser Gln Gln Arg Ile Val Thr Ile Pro
145 150 155 160

Thr Ala Leu Leu Ser Glu Leu Glu Pro Leu Met Gly Gln Thr Tyr Leu
165 170 175

Phe Glu Arg Gly Glu Lys Pro Tyr Ser Arg Gln Trp Ala Phe Arg Gln
180 185 190

Leu Glu Ser Phe Val Arg Arg Arg Phe Pro Ser Leu Ser Ala Gln Val
195 200 205

Leu Arg Asp Ser Leu Phe 
210 

(2) INFORMATION FOR SEQ ID NO: 190:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 239 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 190:

Met Arg Ile Asn Lys Tyr Ile Ala His Ala Gly Val Ala Ser Arg Arg
1 5 10 15

Lys Ala Glu Glu Leu Ile Lys Gln Gly Leu Val Thr Val Asn Gly Gln
20 25 30






Val Val Arg Glu Leu Ala Thr Thr Ile Lys Ser Gly Asp Lys Val Glu
35 40 45

Val Glu Gly Gln Pro Ile Tyr Asn Glu Glu Lys Val Tyr Tyr Leu Leu
50 55 60

Asn Lys Pro Arg Gly Val Ile Ser Ser Val Thr Asp Asp Lys Gly Arg
65 70 75 80

Lys Thr Val Val Asp Leu Leu Pro Asn Val Lys Glu Arg Ile Tyr Pro
85 90 95

Val Gly Arg Leu Asp Trp Asp Thr Ser Gly Val Leu Ile Leu Thr Asn
100 105 110

Asp Gly Asp Phe Thr Asp Glu Met Ile His Pro Arg Asn Glu Ile Asp
115 120 125

Lys Val Tyr Val Ala Arg Val Lys Gly Val Ala Asn Lys Asp Asn Leu
130 135 140

Arg Pro Leu Thr Arg Gly Leu Glu Ile Asp Gly Lys Lys Thr Lys Pro
145 150 155 160

Ala Val Tyr Glu Ile Leu Lys Val Asp Pro Val Lys Asn Arg Ser Val
165 170 175

Val Gln Leu Thr Ile His Glu Gly Arg Asn His Gln Val Lys Lys Met
180 185 190

Phe Glu Ala Val Gly Leu Gln Val Asp Lys Leu Ser Arg Thr Arg Phe
195 200 205

Gly His Leu Asp Leu Thr Leu Arg Pro Gly Glu Ser Arg Arg Leu Asn 
210 215 220 

Lys Lys Glu Ile Ser Gln Leu His Thr Met Ala Val Thr Lys Lys
225 230 235 

(2) INFORMATION FOR SEQ ID NO: 191: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 243 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:191: 

Met Asp Ile Lys Leu Lys Arg Phe Leu Lys Asp Pro Gly Leu Ala Leu
1 5 10 15

Cys Ile Trp Phe Leu Ser Thr Lys Met Asp Ile Tyr Asp Val Pro Ile
20 25 30

Thr Glu Val Ile Glu Gln Tyr Leu Ala Tyr Val Ser Thr Leu Gln Ala
35 40 45

Met Arg Leu Glu Val Thr Gly Glu Tyr Met Val Met Ala Ser Gln Leu
50 55 60

Met Leu Ile Lys Ser Arg Lys Leu Leu Pro Lys Val Ala Glu Val Thr
65 70 75 80

Asp Leu Gly Asp Asp Leu Glu Gln Asp Leu Leu Ser Gln Ile Glu Glu
85 90 95

Tyr Arg Lys Phe Lys Leu Leu Gly Glu His Leu Glu Ala Lys His Gln
100 105 110

Glu Arg Ala Gln Tyr Tyr Ser Lys Ala Pro Thr Glu Leu Ile Tyr Glu
115 120 125

Asp Ala Glu Leu Val His Asp Lys Thr Thr Ile Asp Leu Phe Leu Ala
130 135 140

Phe Ser Asn Ile Leu Ala Lys Lys Lys Glu Glu Phe Ala Gln Asn His
145 150 155 160

Thr Thr Ile Leu Arg Asp Glu Tyr Lys Ile Glu Asp Met Met Ile Ile
165 170 175

Val Lys Glu Ser Leu Ile Gly Arg Asp Gln Leu Arg Leu Gln Asp Leu
180 185 190

Phe Lys Glu Ala Gln Asn Val Gln Glu Val Ile Thr Leu Phe Leu Ala
195 200 205

Thr Leu Glu Leu Ile Lys Thr Gln Glu Leu Ile Leu Val Gln Glu Glu
210 215 220

Ser Phe Gly Asp Ile Tyr Leu Met Glu Lys Lys Glu Glu Ser Gln Val
225 230 235 240

Pro Gln Ser



(2) INFORMATION FOR SEQ ID NO: 192: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 336 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 192: 

Met Ala Gly Lys Arg Asp Ser Cys Gly Ala Cys Arg Ile Met Thr Asn
1 5 10 15

Lys Ile Tyr Glu Tyr Lys Asp Asp Gln Asn Trp Tyr Val Gly Ser Tyr
20 25 30

Ser Ile Phe Gly Gly Val Asn Ser Leu Ser Asp Tyr Lys Ala Asp Phe
35 40 45

Pro Leu Phe Glu Phe Ser Lys Ile Phe Gly Asp Glu Glu Tyr Gly Phe
50 55 60

Pro Leu Ser Val Thr Val Leu Arg Tyr Gly Ser Thr Tyr Arg Leu Phe
65 70 75 80

Ser Phe Val Val Asp Met Leu Asn Gln Glu Met Gly Arg Asn Leu Glu
85 90 95

Val Ile Gln Arg His Gly Ala Leu Leu Leu Val Glu Asn Gly Gln Leu
100 105 110

Leu Tyr Val Glu Leu Pro Lys Glu Gly Val Asn Val His Asp Phe Phe
115 120 125

Glu Thr Ser Lys Val Arg Glu Thr Leu Leu Ile Ala Thr Arg Asn Glu
130 135 140

Gly Lys Thr Lys Glu Phe Arg Ala Ile Phe Asp Lys Leu Gly Tyr Asp
145 150 155 160

Val Glu Asn Leu Asn Asp Tyr Pro Asp Leu Pro Glu Val Ala Glu Thr
165 170 175

Gly Met Thr Phe Glu Glu Asn Ala Arg Leu Lys Ala Glu Thr Ile Ser
180 185 190

Gln Leu Thr Gly Lys Met Val Leu Ala Asp Asp Ser Gly Leu Lys Val
195 200 205

Asp Val Leu Gly Gly Leu Pro Gly Val Trp Ser Ala Arg Phe Ala Gly 
210 215 220 

Val Gly Ala Thr Asp Arg Glu Asn Asn Ala Lys Leu Leu His Glu Leu 
225 230 235 240 

Ala Met Val Phe Glu Leu Lys Asp Arg Ser Ala Gln Phe His Thr Thr
245 250 255

Leu Val Val Ala Ser Pro Asn Lys Glu Ser Leu Val Val Glu Ala Asp
260 265 270

Trp Ser Gly Tyr Ile Asn Phe Glu Pro Lys Gly Glu Asn Gly Phe Gly
275 280 285

Tyr Asp Pro Leu Phe Leu Val Gly Glu Thr Gly Glu Ser Ser Ala Glu
290 295 300

Leu Thr Leu Glu Glu Lys Asn Ser Gln Ser His Arg Ala Leu Ala Val
305 310 315 320

Lys Lys Leu Leu Glu Val Phe Pro Ser Trp Gln Ser Lys Pro Ser Leu
325 330 335 



(2) INFORMATION FOR SEQ ID NO: 193:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 219 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 193:

Glu Asn Asn Tyr Glu Pro Gln Tyr Ile Asn Ile Arg Gly Lys Gly Pro
1 5 10 15

Leu Ile Asn Asp Leu Lys Lys Glu Ala Lys Lys Ala Asn Lys Val Phe
20 25 30



Leu Ala Ser Asp Pro Asp Arg Glu Gly Glu Ala Ile Ser Trp His Leu
35 40 45

Ala His Ile Leu Asn Leu Asp Glu Asn Asp Ala Asn Arg Val Val Phe
50 55 60

Asn Glu Ile Thr Lys Asp Ala Val Lys Asn Ala Phe Lys Glu Pro Arg
65 70 75 80

Lys Ile Asp Met Asp Leu Val Asp Ala Gln Gln Ala Arg Arg Ile Leu
85 90 95

Asp Arg Leu Val Gly Tyr Ser Ile Ser Pro Ile Leu Trp Lys Lys Val
100 105 110

Lys Lys Gly Leu Ser Ala Gly Arg Val Gln Ser Ile Ala Leu Lys Leu
115 120 125

Ile Ile Asp Arg Glu Asn Glu Ile Asn Ala Phe Gln Pro Glu Glu Tyr
130 135 140

Trp Thr Val Asp Ala Val Phe Lys Lys Gly Thr Lys Gln Phe His Ala
145 150 155 160

Ser Phe Tyr Gly Val Asp Gly Lys Lys Met Lys Leu Thr Ser Asn Asn
165 170 175

Glu Val Lys Glu Val Leu Ser Arg Leu Thr Ser Lys Asp Phe Ser Val
180 185 190

Asp Gln Val Asp Lys Lys Glu Arg Lys Ala Asn Ala Pro Leu Pro Tyr
195 200 205

Thr Thr Ser Ser Met Gln Met Gly Cys Cys Gln
210 215

(2) INFORMATION FOR SEQ ID NO: 194:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 236 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 194:

Met Ser Ile His Ile Ala Ala Gln Gln Gly Glu Ile Ala Asp Lys Ile
1 5 10 15

Leu Leu Pro Gly Asp Pro Leu Arg Ala Lys Phe Ile Ala Glu Asn Phe
20 25 30

Leu Gly Asp Ala Val Cys Phe Asn Glu Val Arg Asn Met Phe Gly Tyr
35 40 45

Thr Gly Thr Tyr Lys Gly His Arg Val Ser Val Met Gly Thr Gly Met
50 55 60

Gly Met Pro Ser Ile Ser Ile Tyr Ala Arg Glu Leu Ile Val Asp Tyr
65 70 75 80

Gly Val Lys Lys Leu Ile Arg Val Gly Thr Ala Gly Ser Leu Asn Glu
85 90 95

Glu Val His Val Arg Glu Leu Val Leu Ala Gln Ala Ala Ala Thr Asn
100 105 110

Ser Asn Ile Val Arg Asn Asp Trp Pro Gln Tyr Asp Phe Pro Gln Ile
115 120 125






Ala Ser Phe Asp Leu Leu Asp Lys Ala Tyr His Ile Ala Lys Glu Leu
130 135 140

Gly Met Thr Thr His Val Gly Asn Val Leu Ser Ser Asp Val Phe Tyr
145 150 155 160

Ser Asn Tyr Phe Glu Lys Asn Ile Glu Leu Gly Lys Trp Gly Val Lys
165 170 175

Ala Val Glu Met Glu Ala Ala Ala Leu Tyr Tyr Leu Ala Ala Gln Tyr
180 185 190

His Val Asp Ala Leu Ala Ile Met Thr Ile Ser Asp Ser Leu Val Asn
195 200 205

Pro Asp Glu Asp Thr Thr Ala Glu Glu Arg Gln Asn Thr Phe Thr Asp
210 215 220

Met Met Lys Val Gly Leu Glu Thr Leu Ile Ala Glu
225 230 235 

(2) INFORMATION FOR SEQ ID NO: 195:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 477 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 195:



Ile Ile Phe Pro Ile Leu Thr Gly Thr Tyr Val Ala Arg Val Leu Asp
1 5 10 15

Arg Thr Asp Tyr Gly Tyr Phe Asn Ser Val Asp Thr Ile Leu Ser Phe
20 25 30

Phe Leu Pro Phe Ala Thr Tyr Gly Val Tyr Asn Tyr Gly Leu Arg Ala
35 40 45

Ile Ser Asn Val Lys Asp Asn Lys Lys Asp Leu Asn Arg Thr Phe Ser
50 55 60

Ser Leu Phe Tyr Leu Cys Ile Ala Cys Thr Ile Leu Thr Thr Ala Val
65 70 75 80

Tyr Ile Leu Ala Tyr Pro Leu Phe Phe Thr Asp Asn Pro Ile Val Lys
85 90 95

Lys Val Tyr Leu Val Met Gly Ile Gln Leu Ile Ala Gln Ile Phe Ser
100 105 110

Ile Glu Trp Val Asn Glu Ala Leu Glu Asn Tyr Ser Phe Leu Phe Tyr
115 120 125

Lys Thr Ala Phe Ile Arg Ile Leu Met Leu Val Ser Ile Phe Leu Phe
130 135 140

Val Lys Asn Glu His Asp Ile Val Val Tyr Thr Leu Val Met Ser Leu
145 150 155 160

Ser Thr Leu Ile Asn Tyr Leu Ile Ser Tyr Phe Trp Ile Lys Arg Asp
165 170 175

Ile Lys Leu Val Lys Ile His Leu Ser Asp Phe Lys Pro Leu Phe Leu
180 185 190

Pro Leu Thr Ala Met Leu Val Phe Ala Asn Ala Asn Met Leu Phe Thr 
195 200 205 

Phe Leu Asp Arg Leu Phe Leu Val Lys Thr Gly Ile Asp Val Asn Val
210 215 220

Ser Tyr Tyr Thr Ile Ala Gln Arg Ile Val Thr Val Ile Ala Gly Val
225 230 235 240

Val Thr Gly Ala Ile Gly Val Ser Val Pro Arg Leu Ser Tyr Tyr Leu
245 250 255

Gly Lys Gly Asp Lys Glu Ala Tyr Val Ser Leu Val Asn Arg Gly Ser
260 265 270

Arg Ile Phe Asn Phe Phe Ile Ile Pro Leu Ser Phe Gly Leu Met Val
275 280 285

Leu Gly Pro Asn Ala Ile Leu Leu Tyr Gly Ser Glu Lys Tyr Ile Gly
290 295 300

Gly Gly Ile Leu Thr Ser Leu Phe Ala Phe Arg Thr Ile Ile Leu Ala
305 310 315 320

Leu Asp Thr Ile Leu Gly Ser Gln Ile Leu Phe Thr Asn Gly Tyr Glu
325 330 335

Lys Arg Ile Thr Val Tyr Thr Val Phe Ala Gly Leu Leu Asn Leu Gly
340 345 350

Leu Asn Ser Leu Leu Phe Phe Asn His Ile Val Ala Pro Glu Tyr Tyr
355 360 365

Leu Leu Thr Thr Met Leu Ser Glu Thr Ser Leu Leu Val Phe Tyr Ile
370 375 380

Ile Phe Ile His Arg Lys Gln Leu Ile His Leu Gly His Ile Phe Ser
385 390 395 400

Tyr Thr Val Arg Tyr Ser Leu Phe Ser Leu Ser Phe Val Ala Ile Tyr
405 410 415

Phe Leu Ile Asn Phe Val Tyr Pro Val Asp Met Val Ile Asn Leu Pro
420 425 430

Phe Leu Ile Asn Thr Gly Leu Ile Val Leu Leu Ser Ala Ile Ser Tyr
435 440 445






Ile Ser Leu Leu Val Phe Thr Lys Asp Ser Ile Phe Tyr Glu Phe Leu
450 455 460 

Asn His Val Leu Ala Leu Lys Asn Lys Phe Lys Lys Ser 
465 470 475 

(2) INFORMATION FOR SEQ ID NO: 196:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 148 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 196:



Phe Pro Ile Asp Arg Phe Asp Asp Pro Lys Val Ile Asp Thr Cys Tyr
1 5 10 15

Lys Leu Glu Ser Phe Lys Leu Leu Ser Phe Ser Lys His Lys Asn Ile
20 25 30

Val Tyr Lys Asp Ser Leu Leu Lys Asp Trp Ile Arg Thr Ala Phe Trp
35 40 45

Leu Leu Leu Arg Pro Val Ser Pro Arg Tyr Phe Ala Asn Lys Ile Glu
50 55 60

Lys Glu Ile Gln Lys Tyr Ser Arg Glu Asn Gly Gln Tyr Met Ala Phe
65 70 75 80

Ile Pro Ser Lys Phe Lys Glu Lys Glu Val Phe Pro Ser Gly Thr Phe
85 90 95

Asp Lys Thr Ile Asp Leu Pro Phe Glu Asn Leu Ser Leu Pro Ala Pro
100 105 110

Glu Lys Phe Asp Thr Ile Leu Thr Gln Phe Tyr Gly Asp Tyr Met Thr
115 120 125

Leu Pro Pro Glu Glu Lys Arg Phe Tyr Ser His Glu Phe His Ala Tyr 
130 135 140 

Lys Leu Glu Asp 
145 

(2) INFORMATION FOR SEQ ID NO: 197: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 280 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 197: 

Met Asn Phe Thr Leu Ile Asn Trp Arg Ile Arg Met Gln Tyr Leu Glu
1 5 10 15

Lys Lys Glu Ile Lys Glu Ile Gln Leu Ala Leu Leu Asp Tyr Ile Asp
20 25 30

Glu Thr Cys Lys Lys His Asp Ile Pro Tyr Phe Leu Ser Tyr Gly Thr
35 40 45

Met Leu Gly Ala Ile Arg His Lys Gly Met Ile Pro Trp Asp Asp Asp
50 55 60

Ile Asp Ile Ser Leu Tyr Arg Glu Asp Tyr Glu Arg Leu Leu Lys Ile
65 70 75 80

Ile Glu Glu Glu Asn His Pro Arg Tyr Lys Val Leu Ser Tyr Asp Thr
85 90 95

Ser Ser Trp Tyr Phe His Asn Phe Ala Ser Ile Leu Asp Thr Ser Thr
100 105 110

Val Ile Glu Asp His Val Lys Tyr Lys Arg His Asp Thr Ser Leu Phe
115 120 125

Ile Asp Val Phe Pro Ile Asp Arg Phe Thr Asp Leu Ser Ile Val Asp
130 135 140

Lys Ser Tyr Lys Tyr Val Ala Leu Arg Gln Leu Ala Tyr Ile Lys Lys
145 150 155 160

Ser Arg Ala Val His Gly Asp Ser Lys Leu Lys Asp Phe Leu Arg Leu 
165 170 175 

Cys Ser Trp Tyr Ala Leu Arg Phe Val Asn Pro Arg Tyr Phe Tyr Lys 
180 185 190 



Lys Ile Asp Gln Leu Val Lys Asn Ala Val Thr Asn Thr Pro Gln Tyr
195 200 205

Glu Gly Gly Val Gly Ile Gly Lys Glu Gly Met Lys Glu Ile Phe Pro
210 215 220

Val Asp Thr Phe Lys Glu Leu Ile Leu Thr Glu Phe Glu Gly Arg Met
225 230 235 240

Leu Pro Val Pro Lys Lys Tyr Asp Gln Phe Leu Thr Gln Met Tyr Gly
245 250 255

Asp Tyr Met Thr Pro Pro Ser Lys Glu Met Gln Glu Trp Tyr Ser His
260 265 270

Ser Ile Lys Ala Tyr Arg Lys Asn
275 280



(2) INFORMATION FOR SEQ ID NO: 198:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 223 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 198:

Lys Gly Phe Ile Pro Trp Asp Asp Asp Leu Asp Phe Phe Met Pro Arg
1 5 10 15

Lys Asp Tyr Glu Lys Leu Ala Glu Leu Trp Pro Arg Tyr Ala Asp Glu
20 25 30

Arg Tyr Phe Leu Ser Lys Ser His Lys Asp Phe Val Asp Arg Asn Leu
35 40 45

Phe Ile Thr Ile Arg Asp Lys Lys Thr Thr Cys Ile Lys Pro Tyr Gln
50 55 60

Gln Asp Leu Asp Leu Pro His Gly Leu Ala Leu Asp Val Leu Pro Leu
65 70 75 80

Asp Tyr Tyr Pro Lys Asn Pro Ala Glu Arg Lys Lys Gln Val Arg Trp
85 90 95

Ala Leu Ile Tyr Ser Leu Phe Cys Ala Gln Thr Ile Pro Glu Lys His
100 105 110

Gly Asp Leu Met Lys Trp Gly Ser Arg Ile Leu Leu Gly Leu Thr Pro
115 120 125

Lys Ser Leu Arg Tyr Arg Ile Trp Lys Lys Ala Glu Lys Glu Met Thr
130 135 140

Lys Tyr Asp Leu Ala Asp Cys Asp Gly Ile Thr Glu Leu Cys Ser Gly
145 150 155 160

Pro Gly Tyr Met Arg Asn Lys Tyr Pro Ile Thr Ser Phe Glu Asp Asn
165 170 175

Leu Phe Leu Pro Phe Glu Gly Thr Glu Met Pro Ile Pro Ile Gly Tyr
180 185 190

Asp Val Tyr Leu Arg Thr Ala Phe Gly Asp Tyr Met Thr Pro Pro Pro
195 200 205

Ala Asp Lys Gln Val Pro His His Asp Thr Val Thr Ala Asp Met
210 215 220

(2) INFORMATION FOR SEQ ID NO: 199:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 835 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 199:

Gly Phe Asp Asp Tyr His Pro Ser Cys Gly Arg Ile Leu Ser Val Val
1 5 10 15



Thr Ser Gly Gly Glu Asp Ile Ala Asp Ala Ile Ile Ile Leu Ala Val
20 25 30

Val Ile Ile Asn Ala Ala Phe Gly Val Tyr Gln Glu Gly Lys Ala Glu
35 40 45

Glu Ala Ile Glu Ala Leu Lys Ser Met Ser Ser Pro Val Ala Arg Val
50 55 60

Leu Arg Asp Gly His Met Ala Glu Ile Asp Ser Lys Glu Leu Val Pro
65 70 75 80

Gly Asp Ile Val Ala Leu Glu Ala Gly Asp Val Val Pro Ala Asp Leu
85 90 95

Arg Leu Ile Glu Ala Asn Ser Leu Lys Ile Glu Glu Ala Ala Leu Thr
100 105 110

Gly Glu Ser Val Pro Val Glu Lys Asp Leu Ser Val Asp Leu Ala Thr
115 120 125

Asp Ala Gly Ile Gly Asp Arg Val Asn Met Ala Phe Gln Asn Ser Asn
130 135 140

Val Thr Tyr Gly Arg Gly Met Gly Val Val Val Asn Thr Gly Met Tyr
145 150 155 160

Thr Glu Val Gly His Ile Ala Gly Met Leu Gln Asp Ala Asp Glu Thr
165 170 175

Asp Thr Pro Leu Lys Gln Asn Leu Asn Asn Leu Ser Lys Val Leu Thr
180 185 190

Tyr Ala Ile Leu Val Ile Ala Leu Val Thr Phe Val Val Gly Val Phe
195 200 205

Ile Gln Gly Lys Asn Pro Leu Gly Glu Leu Leu Thr Ser Val Ala Leu
210 215 220

Ala Val Ala Ala Ile Pro Glu Gly Leu Pro Ala Ile Val Thr Ile Val
225 230 235 240

Leu Ser Leu Gly Thr Gln Val Leu Ala Lys Arg His Ser Ile Val Arg
245 250 255

Lys Leu Pro Ala Val Glu Thr Leu Gly Ser Thr Glu Ile Ile Ala Ser
260 265 270

Asp Lys Thr Gly Thr Leu Thr Met Asn Lys Met Thr Val Glu Lys Val 
275 280 285 

Phe Tyr Asp Ala Val Leu His Asp Ser Ala Asp Asp Ile Glu Leu Gly
290 295 300

Leu Glu Met Pro Leu Leu Arg Ser Val Val Leu Ala Asn Asp Thr Lys
305 310 315 320

Ile Asp Val Glu Gly Asn Leu Ile Gly Asp Pro Thr Glu Thr Ala Phe
325 330 335

Ile Gln Tyr Ala Leu Asp Lys Gly Tyr Asp Val Lys Gly Phe Leu Glu
340 345 350

Lys Tyr Pro Arg Val Ala Glu Leu Pro Phe Asp Ser Asp Arg Lys Leu 
355 360 365 

Met Ser Thr Val His Pro Leu Pro Asp Ser Arg Phe Leu Val Ala Val 
370 375 380 

Lys Gly Ala Pro Asp Gln Leu Leu Lys Arg Cys Leu Leu Arg Asp Lys
385 390 395 400

Ala Gly Asp Ile Ala Pro Ile Asp Glu Lys Val Thr Asn Leu Ile His
405 410 415

Thr Asn Asn Ser Glu Met Ala His Gln Ala Leu Arg Val Leu Ala Gly
420 425 430

Ala Tyr Lys Ile Ile Asp Ser Ile Pro Glu Asn Leu Thr Ser Glu Glu
435 440 445

Leu Glu Asn Asp Leu Ile Phe Thr Gly Leu Ile Gly Met Ile Asp Pro
450 455 460

Glu Arg Pro Glu Ala Ala Glu Ala Val Arg Val Ala Lys Glu Ala Gly 
465 470 475 480



Ile Arg Pro Ile Met Ile Thr Gly Asp His Gln Asp Thr Ala Glu Ala
485 490 495

Ile Ala Lys Arg Leu Gly Ile Ile Asp Ala Asn Asp Thr Glu Gly His
500 505 510

Val Leu Thr Gly Ala Glu Leu Asn Glu Leu Ser Asp Glu Glu Phe Glu
515 520 525

Lys Val Val Gly Gln Tyr Ser Val Tyr Ala Arg Val Ser Pro Glu His
530 535 540

Lys Val Arg Ile Val Lys Ala Trp Gln Lys Gln Gly Lys Val Val Ala
545 550 555 560

Met Thr Gly Asp Gly Val Asn Asp Ala Pro Ala Leu Lys Thr Ala Asp 
565 570 575 

Ile Gly Ile Gly Met Gly Ile Thr Gly Thr Glu Val Ser Lys Gly Ala
580 585 590

Ser Asp Met Ile Leu Ala Asp Asp Asn Phe Ala Thr Ile Ile Val Ala
595 600 605

Val Glu Glu Gly Arg Lys Val Phe Ser Asn Ile Gln Lys Thr Ile Gln
610 615 620

Tyr Leu Leu Ser Ala Asn Thr Ala Glu Val Leu Thr Ile Phe Leu Ser
625 630 635 640

Thr Leu Phe Gly Trp Asp Val Leu Gln Pro Val His Leu Leu Trp Ile
645 650 655

Asn Leu Val Thr Asp Thr Phe Pro Ala Ile Ala Leu Gly Val Glu Pro
660 665 670

Ala Glu Pro Gly Val Met Asn His Lys Pro Arg Gly Arg Lys Ala Ser 
675 680 685 

Phe Phe Ser Gly Gly Val Leu Ser Ser Ile Ile Tyr Gln Gly Val Leu
690 695 700

Gln Ala Ala Leu Val Met Ser Val Tyr Gly Leu Ala Ile Ala Tyr Pro
705 710 715 720

Val His Val Gly Asp Asn His Ala Ile His Ala Asp Ala Leu Thr Met
725 730 735

Ala Phe Ala Thr Leu Gly Leu Ile Gln Leu Phe His Ala Tyr Asn Val
740 745 750

Lys Ser Val Tyr Gln Ser Ile Leu Thr Val Gly Pro Phe Lys Ser Lys
755 760 765

Thr Phe Asn Trp Ser Ile Leu Val Ser Phe Ile Leu Leu Met Ala Thr
770 775 780

Ile Val Val Glu Pro Leu Glu Gly Ile Phe His Val Thr Lys Leu Asp
785 790 795 800

Leu Ser Gln Trp Gly Ile Val Met Ala Gly Ser Phe Ser Met Ile Ile
805 810 815

Ile Val Glu Ile Val Lys Phe Ile Gln Arg Lys Leu Gly Phe Asp Lys
820 825 830

Asn Ala Ile
835 



(2) INFORMATION FOR SEQ ID NO: 200:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 525 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 200:



Gly Phe He Leu Phe Phe Val Leu Leu Gly Ala Val Phe Glu Glu Lys 
15 10 15 

Met Arg Lys Asn Thr Ser Gin Ala Val Glu Lys Leu Leu Asp Leu Gin 
20 25 30 

Ala Lys Thr Ala Glu Val Leu Ser Asp Asp Ser Tyr Val Gln Val Pro 
35 40 45 

Leu Glu Gln Val Lys Val Gly Asp Leu Ile Arg Val Arg Pro Gly Glu 
50 55 60 

Lys Ile Ala Val Asp Gly Val Val Val Glu Gly Val Ser Ser Ile Asp 
65 70 75 80 



Glu Ser Met Val Thr Gly Glu Ser Leu Pro Val Asp Lys Thr Val Gly 
85 90 95 

Asp Thr Val Ile Gly Ser Thr Ile Asn His Ser Gly Thr Leu Val Phe 
100 105 110 

Arg Ala Glu Lys Val Gly Ser Glu Thr Val Leu Ala Gln Ile Val Asp 
115 120 125 

Phe Val Lys Lys Ala Gln Thr Ser Arg Ala Pro Ile Gln Asp Leu Thr 
130 135 140 

Asp Lys Ile Ser Gly Ile Phe Val Pro Val Val Val Ile Leu Gly Ile 
145 150 155 160 



Met Thr Phe Trp Val Trp Phe Val Leu Leu Arg Asp Ser Val Val Val 
165 170 175 

Leu Gly Ala Ser Phe Val Ser Ser Leu Leu Tyr Gly Val Ala Val Leu 
180 185 190 

lie lie Ala Cys Pro Cys Ala Leu Gly Leu Ala Thr Pro Thr Ala Leu 
195 200 205 

Met Val Gly Thr Gly Arg Ser Ala Lys Met Gly Val Leu Leu Lys Asn 
210 215 220 

Gly Thr Val Leu Gin Glu He Gin Lys Val Gin Thr Leu Val Phe Asp 
225 230 235 240 

Lys Thr Gly Thr Leu Thr Glu Gly Lys Pro Val Val Thr Asp He He 
245 250 255 

Gly Asp Glu Val Glu Val Phe Gly Leu Ala Ala Ser Leu Glu Asp Ala 
260 265 270 

Ser Gin His Pro Leu Ala Glu Ala He Val Lys Arg Ala Ser Glu Ala 
275 280 285 

Gly Leu Glu Phe Gin Thr Val Glu Asn Phe Gin Ala Leu His Gly Lys 
290 295 300 

Gly Val Ser Gly Arg He Asn Gly Lys Gin Val Leu Leu Gly Asn Ala 
305 310 315 320 

Lys Met Leu Asp Gly Met Asp He Ser Asn Thr Tyr Gin Asp Lys Leu 
325 330 335 

Glu Glu Leu Glu Lys Glu Ala Lys Thr Val Val Phe Leu Ala Val Asp 
340 345 350 

Asn Glu He Lys Gly Leu Leu Ala Leu Gin Asp He Pro Lys Glu Asn 
355 360 365 

Ala Lys Leu Ala He Ser Gin Leu Lys Lys Arg Gly Leu Arg Thr Val 
370 375 380 

Met Leu Thr Gly Asp Asn Ala Gly Val Ala Arg Ala He Ala Asp Gin 
385 390 395 400 

He Gly He Glu Glu Val He Ala Gly Val Leu Pro Glu Glu Lys Ala 
405 410 415 

His Glu He His Lys Leu Gin Ala Ala Gly Lys Val Ala Phe Val Gly 
420 425 430 

Asp Gly He Asn Asp Ala Pro Ala Leu Ser Val Ala Asp Val Gly He 
435 440 445 

Ala Met Gly Ala Gly Thr Asp He Ala He Glu Ser Ala Asp Leu Val 
450 455 460 

Leu Thr Thr Asn Asn Leu Leu Gly Val Val Arg Ala Phe Asp Met Ser 
465 470 475 480 

Lys Lys Thr Phe His Arg He Leu Leu Asn Leu Phe Trp Ala Phe He 
485 490 495 

Tyr Asn Val Val Gly He Pro He Ala Ala Gly Val Phe Ser Gly Val 
500 505 510 

Gly Trp Leu Ser Thr Gln Ile Gly Lys Ala Ser Pro Met 
515 520 525 

(2) INFORMATION FOR SEQ ID NO: 201: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 362 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:201: 

Asn Asp Ile Ile Glu Phe Met Asp Lys Asn Lys Ile Met Gly Leu Thr 
1 5 10 15 

Gln Arg Glu Val Lys Glu Arg Gln Ala Glu Gly Leu Val Asn Asp Phe 
20 25 30 

Thr Ala Ser Ala Ser Thr Ser Thr Trp Gln Ile Val Lys Arg Asn Val 
35 40 45 

Phe Thr Leu Phe Asn Ala Leu Asn Phe Ala Ile Ala Leu Ala Leu Ala 
50 55 60 

Phe Val Gln Ala Trp Ser Asn Leu Val Phe Phe Ala Val Ile Cys Phe 
65 70 75 80 

Asn Ala Phe Ser Gly Ile Val Thr Glu Leu Arg Ala Lys His Met Val 
85 90 95 

Asp Lys Leu Asn Leu Met Thr Lys Glu Lys Val Lys Thr Ile Arg Asp 
100 105 110 



Gly Gln Glu Val Ala Leu Asn Pro Glu Glu Leu Val Leu Gly Asp Val 
115 120 125 

Ile Arg Leu Ser Ala Gly Glu Gln Ile Pro Ser Asp Ala Leu Val Leu 
130 135 140 

Glu Gly Phe Ala Glu Val Asn Glu Ala Met Leu Thr Gly Glu Ser Asp 
145 150 155 160 

Leu Val Gln Lys Glu Val Asp Gly Leu Leu Leu Ser Gly Ser Phe Leu 
165 170 175 

Ala Ser Gly Ser Val Leu Ser Gln Val His His Val Gly Ala Asp Asn 
180 185 190 



Tyr Ala Ala Lys Leu Met Leu Glu Ala Lys Thr Val Lys Pro Ile Asn 
195 200 205 

Ser Arg Ile Met Lys Ser Leu Asp Lys Leu Ala Gly Phe Thr Gly Lys 
210 215 220 

Ile Ile Ile Pro Phe Gly Leu Ala Leu Leu Leu Glu Ala Leu Leu Leu 
225 230 235 240 

Lys Gly Leu Pro Leu Lys Ser Ser Val Val Asn Ser Ser Thr Ala Leu 
245 250 255 

Leu Gly Met Leu Pro Lys Gly Ile Ala Leu Leu Thr Ile Thr Ser Leu 
260 265 270 

Leu Thr Ala Val Ile Lys Leu Gly Leu Lys Lys Val Leu Val Gln Glu 
275 280 285 

Met Tyr Ser Val Glu Thr Leu Ala Arg Val Asp Met Leu Cys Leu Asp 
290 295 300 

Lys Thr Gly Thr Ile Thr Gln Gly Lys Met Gln Val Glu Ala Val Leu 
305 310 315 320 

Pro Leu Thr Glu Thr Tyr Gly Glu Glu Ala Ile Ala Ser Ile Leu Thr 
325 330 335 

Ser Tyr Met Ala His Ser Glu Asp Lys Asn Pro Thr Ala Gln Ala Ile 
340 345 350 

Arg Gln Arg Leu Trp Glu Met Leu Leu Ile 
355 360 

(2) INFORMATION FOR SEQ ID NO: 202: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 243 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:202: 

Ala Ser Asn Ile Met Phe Met Leu Asp Leu Gly Asn His Leu Asp Gln 
1 5 10 15 

Trp Ser Leu Lys Lys Thr Ala Thr Asp Leu Glu Gln Ser Leu Leu Ala 
20 25 30 

Lys Glu Ser Asp Val Phe Leu Val Gln Gly Asp Thr Val Val Ser Ile 
35 40 45 

Lys Ser Ser Asp Val Gln Ile Gly Asp Val Leu Ile Leu Ser Gln Gly 
50 55 60 

Asn Glu Ile Leu Phe Asp Gly Gln Val Val Ser Gly Leu Gly Met Val 
65 70 75 80 

Asn Glu Ser Ser Leu Thr Gly Glu Ser Phe Pro Val Glu Lys Arg Glu 
85 90 95 

Ser Asp Leu Val Cys Ala Asn Thr Val Leu Glu Thr Gly Glu Leu Arg 
100 105 110 

lie Arg Val Thr Asp Asn Gin Met Asn Ser Arg lie Leu Gin Leu lie 
115 120 125 

Glu Leu Met Lys Lys Ser Glu Glu Asn Lys Lys Thr Lys Gin Arg Tyr 
130 135 140 

Phe lie Lys Met Ala Asp Lys Val Val Lys Tyr Asn Phe Leu Gly Ser 
145 150 155 160 

Gly Leu Thr Tyr Leu Leu Thr Gly Ser Phe Ser Lys Ala He Ser Phe 
165 170 175 

Leu Leu Val Asp Phe Ser Cys Ala Leu Lys He Ser Thr Pro Val Ala 
180 185 190 

Tyr Leu Thr Val He Lys Val Gly Leu Asn Arg Glu Met Val He Lys 
195 200 205 

Asp Gly Asp Val Leu Glu Lys Tyr Leu Val Val Asp Thr Phe Leu Phe 
210 215 220 

Asp Lys Thr Gly Pro He Thr Thr Ser Tyr Pro He Val Glu Lys Val 
225 230 235 240 

Tyr Pro Leu 



(2) INFORMATION FOR SEQ ID NO:203: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 307 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:203: 

Lys Gln Ile Glu Val Val Asp Lys Asp Asn Lys Ser Glu Thr Ala Glu 
1 5 10 15 

Ala Ala Ser Val Thr Thr Asn Leu Val Thr Gin Ser Lys Val Ser Ala 
20 25 30 

Val Val Gly Pro Ala Thr Ser Gly Ala Thr Ala Ala Ala Val Ala Asn 
35 40 45 

Ala Thr Lys Ala Gly Val Pro Leu Ile Ser Pro Ser Ala Thr Gln Asp 
50 55 60 

Gly Leu Thr Lys Gly Gln Asp Tyr Leu Phe Ile Gly Thr Phe Gln Asp 
65 70 75 80 

Ser Phe Gin Gly Lys lie lie Ser Asn Tyr Val Ser Glu Lys Leu Asn 
85 90 95 

Ala Lys Lys Val Val Leu Tyr Thr Asp Asn Ala Ser Asp Tyr Ala Lys 
100 105 110 

Gly He Ala Lys Ser Phe Arg Glu Ser Tyr Lys Gly Glu He Val Ala 
115 120 125 

Asp Glu Thr Phe Val Ala Gly Asp Thr Asp Phe Gin Ala Ala Leu Thr 
130 135 140 

Lys Met Lys Gly Lys Asp Phe Asp Ala He Val Val Pro Gly Tyr Tyr 
145 150 155 160 

Asn Glu Ala Gly Lys He Val Asn Gin Ala Arg Gly Met Gly He Asp 
165 170 175 

Lys Pro He Val Gly Gly Asp Gly Phe Asn Gly Glu Glu Phe Val Gin 
180 185 190 

Gin Ala Thr Ala Glu Lys Ala Ser Asn He Tyr Phe He Ser Gly Phe 
195 200 205 

Ser Thr Thr Val Glu Val Ser Ala Lys Ala Lys Ala Phe Leu Asp Ala 
210 215 220 

Tyr Arg Ala Lys Tyr Asn Glu Glu Pro Ser Thr Phe Ala Ala Leu Ala 
225 230 235 240 

Tyr Asp Ser Val His Leu Val Ala Asn Ala Ala Lys Gly Ala Lys Asn 
245 250 255 

Ser Gly Glu He Lys Asn Asn Leu Ala Lys Thr Lys Asp Phe Glu Gly 
260 265 270 

Val Thr Gly Gin Thr Ser Phe Asp Ala Asp His Asn Thr Val Lys Thr 
275 280 285 

Ala Tyr Met Met Thr Met Asn Asn Gly Lys Val Glu Ala Ala Glu Val 
290 295 300 

Val Lys Pro 
305 

(2) INFORMATION FOR SEQ ID NO: 204: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 289 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 



(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:204: 

Met Leu Gln Gln Leu Val Asn Gly Leu Ile Leu Gly Ser Val Tyr Ala 
1 5 10 15 

Leu Leu Ala Leu Gly Tyr Thr Met Val Tyr Gly lie lie Lys Leu lie 
20 25 30 

Asn Phe Ala His Gly Asp lie Tyr Met Met Gly Ala Phe lie Gly Tyr 
35 40 45 

Phe Leu lie Asn Ser Phe Gin Met Asn Phe Phe Val Ala Leu lie Val 
50 55 60 

Ala Met Leu Ala Thr Ala He Leu Gly Val Val He Glu Phe Leu Ala 
65 70 75 80 

Tyr Arg Pro Leu Arg His Ser Thr Arg He Ala Val Leu He Thr Ala 
85 90 95 

He Gly Val Ser Phe Leu Leu Glu Tyr Gly Met Val Tyr Leu Val Gly 
100 105 110 

Ala Asn Thr Arg Ala Phe Pro Gin Ala He Gin Thr Val Arg Tyr Asp 
115 120 125 

Leu Gly Pro He Ser Leu Thr Asn Val Gin Leu Met lie Leu Gly lie 
130 135 140 

Ser Leu He Leu Met lie Leu Leu Gin Val lie Val Gin Lys Thr Lys 
145 150 155 160 

Met Gly Lys Ala Met Arg Ala Val Ser Val Asp Ser Asp Ala Ala Gin 
165 170 175 

Leu Met Gly He Asn lie Asn Arg Thr He Ser Phe Thr Phe Ala Leu 
180 185 190 

Gly Ser Ala Leu Ala Gly Ala Ala Gly Val Leu lie Ala Leu Tyr Tyr 
195 200 205 

Asn Ser Leu Glu Pro Leu Met Gly Val Thr Pro Gly Leu Lys Ser Phe 
210 215 220 

Val Ala Ala Val Leu Gly Gly lie Gly lie lie Pro Gly Ala Ala Leu 
225 230 235 240 

Gly Gly Phe Val lie Gly Leu Leu Glu Thr Phe Ala Thr Ala Phe Gly 
245 250 255 

Met Ser Asp Phe Arg Asp Ala lie Val Tyr Gly lie Leu Leu Leu lie 
260 265 270 

Leu lie Val Arg Pro Ala Gly lie Leu Gly Lys Asn Val Lys Glu Lys 
275 280 285 

Val 

(2) INFORMATION FOR SEQ ID NO:205: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 106 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:205: 

Ser Gln Asp Gln Thr Trp Tyr Ala Leu Ala Tyr Asp Gly Ala Glu Val 
1 5 10 15 

He Gly Phe Leu Thr Val Gin Glu Thr Leu Phe Glu Ala Glu Val Leu 
20 25 30 

Gin He Ala Val Lys Gly Ala Tyr Gin Gly Gin Gly lie Ala Ser Ala 
35 40 45 

Leu Phe Ala Gin Leu Pro Thr Asp Lys Glu He Phe Leu Glu Val Arg 
50 55 60 

Gln Ser Asn Gln Arg Ala Gln Ala Phe Tyr Lys Lys Glu Lys Met Ala 
65 70 75 80 

Val He Ala Glu Arg Lys Ala Tyr Tyr His Asp Pro Val Glu Asp Ala 
85 90 95 

He He Met Lys Arg Glu He Asp Glu Gly 
100 105 

(2) INFORMATION FOR SEQ ID NO: 206: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 45 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:206: 

Lys Thr Leu Lys Gly His Gly Gln Phe Leu His Ala Lys Thr Leu Gly 
1 5 10 15 

Phe Thr His Pro Arg Thr Gly Lys Thr Leu Glu Phe Lys Ala Asp Ile 
20 25 30 

Pro Glu Ile Phe Lys Glu Thr Leu Glu Arg Leu Arg Lys 
35 40 45 

(2) INFORMATION FOR SEQ ID NO: 207: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 163 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 207: 

Arg Glu Met Val Val His Pro Ser Ala Gly His Thr Ser Gly Thr Leu 
1 5 10 15 

Val Asn Ala Leu Met Tyr His lie Lys Asp Leu Ser Gly lie Asn Gly 
20 25 30 

Val Leu Arg Pro Gly He Val His Arg He Asp Lys Asp Thr Ser Gly 
35 40 45 

Leu Leu Met He Ala Lys Asn Asp Asp Ala His Leu Val Leu Ala Gin 
50 55 60 

Glu Leu Lys Asp Lys Lys Ser Leu Arg Lys Tyr Trp Ala He Val His 
65 70 75 80 

Gly Asn Leu Pro Asn Asp Arg Gly Val He Glu Ala Pro He Gly Arg 
85 90 95 

Ser Glu Lys Asp Arg Lys Lys Gin Ala Val Thr Ala Lys Gly Lys Pro 
100 105 110 

Ala Val Thr Arg Phe His Val Leu Glu Arg Phe Gly Asp Tyr Ser Leu 
115 120 125 

Val Glu Leu Gin Leu Glu Thr Gly Arg Thr His Gin He Arg Val His 
130 135 140 

Met Ala Tyr He Gly His Pro Val Ala Gly Asp Glu Val Tyr Gly Pro 
145 150 155 160 

Ala Arg Leu 

(2) INFORMATION FOR SEQ ID NO: 208: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 224 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:208: 

Leu Gly Thr Arg Gly Ser Ser Arg Val Asp Asn Ile Asn Leu Gln Val 
1 5 10 15 

Asp Glu Arg Asp Arg He Ala Leu Val Gly Lys Asn Gly Ala Gly Lys 
20 25 30 

Ser Thr Leu Leu Lys He Leu Val Gly Glu Glu Glu Pro Thr Ser Gly 
35 40 45 

Glu He Asn Lys Lys Lys Asp He Ser Leu Ser Tyr Leu Ala Gin Asp 
50 55 60 

Ser Arg Phe Glu Ser Glu Asn Thr He Tyr Asp Glu Met Leu His Val 
65 70 75 80 

Phe Asn Asp Leu Arg Arg Thr Glu Arg Gin Leu Arg Gin Met Glu Leu 
85 90 95 

Glu Met Gly Glu Lys Ser Gly Glu Asp Leu Asp Lys Leu Met Ser Asp 
100 105 110 

Tyr Asp Arg Leu Ser Glu Asn Phe Arg Gin Ala Gly Gly Phe Thr Tyr 
115 120 125 

Glu Ala Asp He Arg Ala He Leu Asn Gly Phe Lys Phe Asp Glu Ser 
130 135 140 

Met Trp Gin Met Lys He Ala Glu Leu Ser Gly Gly Gin Asn Thr Arg 
145 150 155 160 

Leu Ala Leu Ala Lys Met Leu Leu Glu Lys Pro Asn Leu Leu Val Leu 
165 170 175 

Asp Glu Pro Thr Asn His Leu Asp He Glu Thr He Ala Trp Leu Glu 
180 185 190 

Asn Tyr Leu Val Asn Tyr Ser Gly Ala Leu Ile Ile Val Ser His Asp 
195 200 205 

Arg Tyr Phe Leu Asp Lys Val Ala Thr He Thr Leu Asp Leu Thr Ser 
210 215 220 

(2) INFORMATION FOR SEQ ID NO:209: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 
(D) TOPOLOGY: not relevant 
(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 209: 

Ser Thr Thr His His Leu Leu Val Lys Lys Val Asn Gly Leu Leu Val 
1 5 10 15 

Arg Trp Lys Asn Ala Cys Arg Gln Asn Cys Lys Gln Thr Phe Xaa Phe 
20 25 30 

Val Leu Thr Gln Leu Ile His Ala Asp Lys Trp Thr Val Ser Gly Arg 
35 40 45 

Gly Glu Leu His Leu Ser Ile Leu Ile Glu Thr Met Arg Arg Glu Gly 
50 55 60 

Tyr Glu Leu Gln Val Ser Arg Pro Glu Val Ile Val Lys Glu Ile Asp 
65 70 75 80 

Gly Val Lys Cys Glu Pro Phe Glu Arg Val Gln Ile Asp Thr Pro Glu 
85 90 95 

Glu Tyr Gln Gly Ser Val Ile Gln Ser Leu Ser Glu Arg Lys Gly Glu 
100 105 110 



Met Leu Asp Met Ile Ser Thr Gly Asn Gly Gln Thr Arg Leu Val Phe 
115 120 125 

Leu Val Pro Ala Arg Gly Leu Xaa Trp Ile Leu Asn Val Leu Val Asn 
130 135 140 

Asp Ser Trp Leu Arg Tyr His Glu Pro Tyr Leu Arg Pro Ile Leu Ala 
145 150 155 160 

Ile Asp Ser Arg Gly Asn Trp Trp Thr Ser Pro Trp Cys Pro Cys Phe 
165 170 175 

Tyr Arg Cys Trp Gly Tyr Asn Leu Leu Asn Leu Leu Leu Ser Thr Leu 
180 185 190 



(2) INFORMATION FOR SEQ ID NO: 210: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 252 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 
(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:210: 

Met Phe Gly Phe Phe Lys Lys Asp Lys Ala Val Glu Val Glu Val Pro 
1 5 10 15 

Thr Gin Val Pro Ala His He Gly He He Met Asp Gly Asn Gly Arg 
20 25 30 

Trp Ala Lys Lys Arg Met Gin Pro Arg Val Phe Gly His Lys Ala Gly 
35 40 45 

Met Glu Ala Leu Gin Thr Val Thr Lys Ala Ala Asn Lys Leu Gly Val 
50 55 60 

Lys Val He Thr Val Tyr Ala Phe Ser Thr Glu Asn Trp Thr Arg Pro 
65 70 75 80 

Asp Gin Glu Val Lys Phe Xaa Met Asn Leu Pro Val Glu Phe Tyr Asp 
85 90 95 

Asn Tyr Val Pro Glu Leu His Ala Asn Asn Val Lys Ile Gln Met Ile 
100 105 110 

Gly Glu Thr Asp Arg Leu Pro Lys Gin Thr Phe Glu Ala Leu Thr Lys 
115 120 125 

Ala Glu Glu Leu Thr Lys Asn Asn Thr Gly Leu He Leu Asn Phe Ala 
130 135 140 

Leu Asn Tyr Gly Gly Arg Ala Glu He Thr Gin Ala Leu Lys Leu He 
145 150 155 160 

Ser Gin Asp Val Leu Asp Ala Lys He Asn Pro Gly Asp He Thr Glu 
165 170 175 

Glu Leu He Gly Asn Tyr Leu Phe Thr Gin His Leu Pro Lys Asp Leu 
180 185 190 

Arg Asp Pro Asp Leu He He Arg Thr Ser Gly Glu Leu Arg Leu Ser 
195 200 205 

Asn Phe Leu Pro Trp Gin Gly Ala Tyr Ser Glu Leu Tyr Phe Thr Asp 
210 215 220 

Thr Leu Trp Pro Asp Phe Asp Glu Ala Ala Leu Gin Glu Ala He Leu 
225 230 235 240 

Ala Tyr Asn Arg Arg His Arg Arg Phe Gly Gly Val 
245 250 

(2) INFORMATION FOR SEQ ID NO: 211: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 67 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 211: 

Val Glu Gln Lys Leu Arg Gly Arg Asn Glu Asn Glu Ile Gln Ser Glu 
1 5 10 15 

Asp Ile Gly Ser Leu Val Met Glu Glu Leu Ala Glu Leu Asp Glu Ile 
20 25 30 

Thr Tyr Val Arg Phe Ala Ser Val Tyr Arg Ser Phe Lys Asp Val Ser 
35 40 45 

Glu Leu Glu Ser Leu Leu Gln Gln Ile Thr Gln Ser Ser Lys Lys Lys 
50 55 60 

Lys Glu Arg 
65 

(2) INFORMATION FOR SEQ ID NO:212: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 75 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:212: 

Val Asp Ser Arg Gln Ala Glu Glu Gly Asn Thr Ile Arg Arg Arg Arg 
1 5 10 15 

Glu Cys Asp Glu Cys Gin His Arg Phe Thr Thr Tyr Glu Arg Val Glu 
20 25 30 

Glu Arg Thr Leu Val Val Val Lys Lys Asp Gly Thr Arg Glu Gin Phe 
35 40 45 

Ser Arg Asp Lys He Phe Asn Gly He He Arg Ser Ala Gin Lys Arg 
50 55 60 

Pro Val Ser Ser Asp Glu He Asn Met Val He 
65 70 75 

(2) INFORMATION FOR SEQ ID NO: 213: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 63 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:213: 

Phe Ala Gin Val Pro Lys Val Ala Gin Lys Val Met Lys Val Thr Lys 
1 5 10 15 

Ala Ala Gly Met Asn lie He Ser Asn Cys Glu Glu Val Ala Gly Gin 
20 25 30 

Thr Val Phe His Thr His Val His Leu Val Pro Arg Tyr Ser Ala Asp 
35 40 45 

Asp Asp Leu Lys He Asp Phe He Ala His Glu Thr Asp Phe Asp 
50 55 60 

(2) INFORMATION FOR SEQ ID NO:214: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 67 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 214: 

Met Ser Asp Cys He Phe Cys Lys He He Ala Gly Glu He Pro Ala 
1 5 10 15 

Ser Lys Val Tyr Glu Asp Glu Gin Val Leu Ala Phe Leu Asp He Ser 
20 25 30 

Gin Val Thr Leu Gly His Thr Leu Val Val Pro Lys Glu His Tyr Ara 
35 40 45 

Asn Leu Leu Glu Met Asp Ala Thr Ser Ala Thr Asn Ser Leu Pro Lys 
50 55 60 

Tyr Gln Lys 
65 



(2) INFORMATION FOR SEQ ID NO: 215: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 212 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 215: 



Ile Gln Ala Val Arg Asp Val Ser Phe Glu Val Asn Glu Gly Glu Val 
1 5 10 15 

Val Ser Leu Ile Gly Ala Asn Gly Ala Gly Lys Thr Thr Ile Leu Arg 
20 25 30 

Thr Leu Ser Gly Leu Val Arg Pro Ser Ser Gly Lys He Glu Phe Leu 
35 40 45 

Gly Gin Glu He Gin Lys Met Pro Ala Gin Lys He Val Ala Gly Gly 
50 55 60 

Leu Ser Gin Val Pro Glu Gly Arg His Val Phe Pro Gly Leu Thr Val 
65 70 75 80 



Met Glu Asn Leu Glu Met Gly Ala Phe Leu Lys Lys Asn Arg Glu Glu 
85 90 95 

Asn Gln Ala Asn Leu Lys Lys Val Phe Ser Arg Phe Pro Arg Leu Glu 
100 105 110 

Glu Arg Lys Asn Gln Asp Ala Ala Thr Leu Ser Gly Gly Glu Gln Gln 
115 120 125 



Met Leu Ala Met Gly Arg Ala Leu Met Ser Thr Pro Lys Leu Leu Leu 
130 135 140 

Leu Asp Glu Pro Ser Met Gly Leu Ala Pro He Phe He Gin Glu He 
145 150 155 160 



Phe Asp Ile Ile Gln Asp Ile Gln Lys Gln Gly Thr Thr Val Leu Leu 
165 170 175 

Ile Glu Gln Asn Ala Asn Lys Ala Leu Ala Ile Ser Asp Arg Gly Tyr 
180 185 190 

Val Leu Glu Gin Gly Asn Arg Leu Ser Gly Thr Gly Lys Asp Ser Leu 
195 200 205 

He Arg Gly Val 
210 

(2) INFORMATION FOR SEQ ID NO: 216: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 amino acids 
(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 
(D) TOPOLOGY: not relevant 
(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:216: 

Leu Leu Ser Leu Ile Asp Ile Leu Val Asp Gly Arg Tyr Asp Arg Thr 
1 5 10 15 

Lys Arg Asn Leu Met Leu Gln Phe Arg Gly Ser Ser Asn Gln Arg Ile 
20 25 30 

Ile Asp Ser Arg Gly Ser Pro Gly Thr Glu Leu 



(2) INFORMATION FOR SEQ ID NO: 217: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 130 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 217: 

Met Asn Asn Pro Lys Pro Gin Glu Trp Lys Ser Glu Glu Leu Ser Gin 



Gly Arg He He Asp Tyr Lys Ala Phe Asn Phe Val Asp Gly Glu Gly 
20 25 30 

Val Arg Asn Ser Leu Tyr Val Ser Gly Cys Met Phe His Cys Glu Gly 
35 40 45 

Cys Tyr Asn Val Ala Thr Trp Ser Phe Asn Ala Gly He Pro Tyr Thr 
50 55 60 

Ala Glu Leu Glu Glu Gin He Met Ala Asp Leu Ala Gin Pro Tyr Val 
65 70 75 80 

Gin Gly Leu Thr Leu Leu Gly Gly Glu Pro Phe Leu Asn Thr Gly He 
85 90 95 

Leu Leu Pro Leu Val Lys Arg He Arg Lys Glu Leu Pro Asp Lys Asp 
100 105 110 

Ile Trp Ser Trp Thr Gly Tyr Thr Trp Glu Glu Met Ile Pro Gly Asn 
115 120 125 

Ser Arg 
130 

(2) INFORMATION FOR SEQ ID NO: 218: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 126 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 
(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 218: 

Met Val Asn His Phe Arg Ile Asp Arg Val Gly Met Glu Ile Lys Arg 
1 5 10 15 

Glu Val Asn Glu Ile Leu Gln Lys Lys Val Arg Asp Pro Arg Val Gln 
20 25 30 

Gly Val Thr Ile Thr Asp Val Gln Met Leu Gly Asp Leu Ser Val Ala 
35 40 45 

Lys Val Tyr Tyr Thr Ile Leu Ser Asn Leu Ala Ser Asp Asn Gln Lys 
50 55 60 

Ala Gln Ile Gly Leu Glu Lys Ala Thr Gly Thr Ile Lys Arg Glu Leu 
65 70 75 80 

Gly Arg Asn Leu Lys Leu Tyr Xaa Ile Pro Asp Leu Thr Phe Val Lys 
85 90 95 

Glu Ser Ile Glu Xaa Gly Thr Lys Ile Asp Glu Met Leu Arg Asn 
100 105 110 

Leu Asp Lys Thr Lys Glu Glu Gly Val Ala Pro Leu Phe Trp 
115 120 125 

(2) INFORMATION FOR SEQ ID NO: 219: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 267 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 219: 

Phe His His Val Thr Val Leu Leu His Glu Thr Ile Asp Met Leu Asp 
1 5 10 15 

Val Lys Pro Glu Gly Ile Tyr Val Asp Ala Thr Leu Gly Gly Ala Gly 
20 25 30 

His Ser Glu Tyr Leu Leu Ser Lys Leu Ser Glu Lys Gly His Leu Tyr 
35 40 45 

Ala Phe Asp Gln Asp Gln Asn Ala Ile Asp Asn Ala Gln Lys Arg Leu 
50 55 60 

Ala Pro Tyr Ile Glu Lys Gly Met Val Thr Phe Ile Lys Asp Asn Phe 
65 70 75 80 

Arg His Leu Gln Ala Arg Leu Arg Glu Ala Gly Val Gln Glu Ile Asp 
85 90 95 

Gly Ile Cys Tyr Asp Leu Gly Val Ser Ser Pro Gln Leu Asp Gln Arg 
100 105 110 

Glu Arg Gly Phe Ser Tyr Lys Lys Asp Ala Pro Leu Asp Met Arg Met 
115 120 125 

Asn Gln Asp Ala Ser Leu Thr Ala Tyr Glu Val Val Asn His Tyr Asp 
130 135 140 

Tyr His Asp Leu Val Arg Ile Phe Phe Lys Tyr Gly Glu Asp Lys Phe 
145 150 155 160 

Ser Lys Gln Ile Ala Arg Lys Ile Glu Gln Ala Arg Glu Val Lys Pro 
165 170 175 

Ile Glu Thr Thr Thr Glu Leu Ala Glu Ile Ile Lys Leu Val Lys Pro 
180 185 190 

Ala Lys Glu Leu Lys Lys Lys Gly His Pro Ala Lys Gln Ile Phe Gln 
195 200 205 

Ala Ile Arg Ile Glu Val Asn Asp Glu Leu Gly Ala Ala Asp Glu Ser 
210 215 220 

Ile Gln Gln Ala Met Asp Met Leu Ala Leu Asp Gly Arg Ile Ser Val 
225 230 235 240 

Ile Thr Phe His Ser Leu Glu Asp Arg Leu Thr Lys Gln Leu Phe Lys 
245 250 255 

Xaa Ala Ser Thr Val Glu Val Pro Lys Gly Leu 
260 265 

(2) INFORMATION FOR SEQ ID NO: 220: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 165 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 220: 

Leu Met His Val Thr Val Gly Glu Leu Ile Gly Asn Phe Ile Leu Ile 
1 5 10 15 

Thr Gly Ser Phe Ile Leu Leu Leu Val Leu Ile Lys Lys Phe Ala Trp 
20 25 30 

Ser Asn Ile Thr Gly Ile Phe Glu Glu Arg Ala Glu Lys Ile Ala Ser 
35 40 45 

Asp Ile Asp Arg Ala Glu Glu Ala Arg Gln Lys Ala Glu Val Leu Ala 
50 55 60 

Gln Lys Arg Glu Asp Glu Leu Ala Gly Ser Arg Lys Glu Ala Lys Thr 
65 70 75 80 

Ile Ile Glu Asn Ala Lys Glu Thr Ala Glu Gln Ser Lys Ala Asn Ile 
85 90 95 

Leu Ala Asp Ala Lys Leu Glu Ala Gly His Leu Lys Glu Lys Ala Asn 
100 105 110 

Gln Glu Ile Ala Gln Asn Lys Val Glu Ala Leu Gln Ser Val Lys Gly 
115 120 125 

Glu Val Ala Asp Leu Thr Ile Ser Leu Ala Gly Lys Ile Ile Ser Gln 
130 135 140 

Asn Leu Asp Ser His Ala His Lys Ala Leu Ile Asp Gln Tyr Ile Asp 
145 150 155 160 

Gln Leu Gly Glu Ala 
165 

(2) INFORMATION FOR SEQ ID NO:221: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 629 amino acids 
(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 



(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 221: 

Met Gln Arg Leu Val Ser Leu Leu Ile Trp Ser Leu Leu Glu Thr Ser 
1 5 10 15 

Ile Leu Ser Ile His Gly Leu Gly Pro Leu Thr Lys Arg Phe Gly Val 
20 25 30 

Ala Leu Glu His His His Met Ala Asn Tyr Asp Ala Glu Ala Thr Gly 
35 40 45 

Arg Leu Leu Phe Ile Phe Ile Lys Glu Val Ala Glu Lys His Gly Val 
50 55 60 

Thr Asp Leu Ala Arg Leu Asn Ile Asp Leu Ile Ser Pro Asp Ser Tyr 
65 70 75 80 

Lys Lys Ala Arg Ile Lys His Ala Thr Ile Tyr Val Lys Asn Gln Val 
85 90 95 

Gly Leu Lys Asn Ile Phe Lys Leu Val Ser Leu Ser Asn Thr Lys Tyr 
100 105 110 

Phe Glu Gly Val Ser Arg Ile Pro Arg Thr Val Leu Asp Ala His Arg 
115 120 125 

Glu Gly Leu Ile Leu Gly Ser Ala Cys Ser Glu Gly Glu Val Phe Asp 
130 135 140 

Val Val Val Ser Gln Gly Val Asp Ala Ala Val Glu Val Ala Lys Tyr 
145 150 155 160 

Tyr Asp Phe Ile Glu Val Met Pro Pro Ala Ile Tyr Ala Pro Leu Ile 
165 170 175 

Ala Lys Glu Gln Val Lys Asp Met Glu Glu Leu Gln Thr Ile Ile Lys 
180 185 190 

Ser Leu Ile Glu Val Gly Asp Arg Leu Gly Lys Pro Val Leu Ala Thr 
195 200 205 

Gly Asn Val His Tyr Ile Glu Pro Glu Glu Glu Ile Tyr Arg Glu Ile 
210 215 220 


Ile Val teg Ser Leu Gly Gln Gly Ala Met Ile Asn Arg Thr Ile Gly 
225 230 235 240 

His Gly Glu His Ala Gln Pro Ala Pro Leu Pro Lys Ala His Phe Arg 
245 250 255 

Thr Thr Asn Glu Met Leu Asp Glu Phe Ala Phe Leu Gly Glu Glu Leu 
260 265 270 

Ala Arg Lys Leu Val Ile Glu Asn Thr Asn Ala Leu Ala Glu Ile Phe 
275 280 285 

Glu Pro Val Glu Val Val Lys Gly Asp Leu Tyr Thr Pro Phe Ile Asp 
290 295 300 

Lys Ala Glu Glu Thr Val Ala Glu Leu Thr Tyr Lys Lys Ala Phe Glu 
305 310 315 320 

Ile Tyr Gly Asn Pro Leu Pro Asp Ile Val Asp Leu Arg Ile Glu Lys 
325 330 335 

Glu Leu Thr Ser Ile Leu Gly Asn Gly Phe Ala Val Ile Tyr Leu Ala 
340 345 350 

Ser Gln Met Leu Val Gln Arg Ser Asn Glu Arg Gly Tyr Leu Val Gly 
355 360 365 

Ser Arg Gly Ser Val Gly Ser Ser Phe Val Ala Thr Met Ile Gly Ile 
370 375 380 

Thr Glu Val Asn Pro Leu Ser Pro His Tyr Val Cys Gly Gln Cys Gln 
385 390 395 400 

Tyr Ser Glu Phe Ile Thr Asp Gly Ser Tyr Gly Ser Gly Phe Asp Met 
405 410 415 



Pro His ^ys Asp Cys Pro Asn Cys Gly His Lys Leu Ser Lys Asn Gly 
420 425 430 

Gln Asp Ile Pro Phe Glu Thr Phe Leu Gly Phe Asp Gly Asp Lys Val 
435 440 445 

Pro Asp Ile Asp Leu Asn Phe Ser Gly Glu Asp Gln Pro Ser Ala His 
450 455 460 

Leu Asp Val Arg Asp Ile Phe Gly Glu Glu Tyr Ala Phe Arg Ala Gly 
465 470 475 480 

Thr Val Gly Thr Val Ala Ala Lys Thr Ala Tyr Gly Phe Val Lys Gly 
485 490 495 

Tyr Glu Arg Asp Tyr Gly Lys Phe Tyr Arg Asp Ala Glu Val Glu Arg 
500 505 510 

Leu Ala Gln Gly Ala Ala Gly Val Lys Arg Thr Thr Gly Gln His Pro 
515 520 525 

Gly Gly Ile Val Val Ile Pro Asn Tyr Met Asp Val Tyr Asp Phe Thr 
530 535 540 

Pro Val Gln Tyr Pro Ala Asp Asp Val Thr Ala Glu Trp Gln Thr Thr 
545 550 555 560 

His Phe Asn Phe His Asp Ile Asp Glu Asn Val Leu Lys Leu Asp Val 
565 570 575 

Leu Gly His Asp Asp Pro Thr Met Ile Arg Lys Leu Gln Asp Leu Ser 
580 585 590 

Gly Ile Asp Pro Asn Lys Ile Pro Met Asp Asp Glu Gly Val Met Ala 
595 600 605 

Leu Phe Ser Gly Thr Asp Val Leu Gly Val Thr Pro Glu Gln Ile Gly 
610 615 620 

Thr Leu Arg Val Cys 
625 

(2) INFORMATION FOR SEQ ID NO: 222: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 693 amino acids 
(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 
(D) TOPOLOGY: not relevant 
(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 222: 

Met Ala Arg Glu Phe Ser Leu Glu Lys Thr Arg Asn Ile Gly Ile Met 
1 5 10 15 

Ala His Val Asp Ala Gly Lys Thr Thr Thr Thr Glu Arg Ile Leu Tyr 
20 25 30 

Tyr Thr Gly Lys Ile His Lys Ile Gly Glu Thr His Glu Gly Ala Ser 
35 40 45 

Gln Met Asp Trp Met Glu Gln Glu Gln Glu Arg Gly Ile Thr Ile Thr 
50 55 60 

Ser Ala Ala Thr Thr Ala Gln Trp Asn Asn His Arg Val Asn Ile Ile 
65 70 75 80 

Asp Thr Pro Gly His Val Asp Phe Thr Ile Glu Val Gln Arg Ser Leu 
85 90 95 

Arg Val Leu Asp Gly Ala Val Thr Val Leu Asp Ser Gln Ser Gly Val 
100 105 110 

Glu Pro Gln Thr Glu Thr Val Trp Arg Gln Ala Thr Glu Tyr Gly Val 
115 120 125 

Pro Arg Ile Val Phe Ala Asn Lys Met Asp Lys Ile Gly Ala Asp Phe 
130 135 140 

Leu Tyr Ser Val Ser Thr Leu His Asp Arg Leu Gln Ala Asn Ala His 
145 150 155 160 

Pro Ile Gln Leu Pro Ile Gly Ser Glu Asp Asp Phe Arg Gly Ile Ile 
165 170 175 

Asp Leu Ile Lys Met Lys Ala Glu Ile Tyr Thr Asn Asp Leu Gly Thr 
180 185 190 

Asp Ile Leu Glu Glu Asp Ile Pro Ala Glu Tyr Leu Asp Gln Ala Gln 
195 200 205 

Glu Tyr Arg Glu Lys Leu Ile Glu Ala Val Ala Glu Thr Asp Glu Glu 
210 215 220 

Leu Met Met Lys Tyr Leu Glu Gly Glu Glu Ile Thr Asn Glu Glu Leu 
225 230 235 240 

Lys Ala Gly Ile Arg Lys Ala Thr Ile Asn Val Glu Phe Phe Pro Val 
245 250 255 



Leu Cys Gly Ser Ala Phe Lys Asn Lys Gly Val Gln Leu Met Leu Asp
260 265 270

Ala Val Ile Asp Tyr Leu Pro Ser Pro Leu Asp Ile Pro Ala Ile Lys
275 280 285

Gly Ile Asn Pro Asp Thr Asp Ala Glu Glu Ile Arg Pro Ala Ser Asp
290 295 300

Glu Glu Pro Phe Ala Ala Leu Ala Phe Lys Ile Met Thr Asp Pro Phe
305 310 315 320

Val Gly Arg Leu Thr Phe Phe Arg Val Tyr Ser Gly Val Leu Gln Ser
325 330 335

Gly Ser Tyr Val Leu Asn Thr Ser Lys Gly Lys Arg Glu Arg Ile Gly
340 345 350

Arg Ile Leu Gln Met His Ala Asn Ser Arg Gln Glu Ile Asp Thr Val
355 360 365

Tyr Ser Gly Asp Ile Ala Ala Ala Val Gly Leu Lys Asp Thr Thr Thr
370 375 380

Gly Asp Ser Leu Thr Asp Glu Lys Ala Lys Ile Ile Leu Glu Ser Ile
385 390 395 400

Asn Val Pro Glu Pro Val Ile Gln Leu Met Val Glu Pro Lys Ser Lys
405 410 415

Ala Asp Gln Asp Lys Met Gly Ile Ala Leu Gln Lys Leu Ala Glu Glu
420 425 430

Asp Pro Thr Phe Arg Val Glu Thr Asn Val Glu Thr Gly Glu Thr Val
435 440 445

Ile Ser Gly Met Gly Glu Leu His Leu Asp Val Leu Val Asp Arg Met
450 455 460

Arg Arg Glu Phe Lys Val Glu Ala Asn Val Gly Ala Pro Gln Val Ser
465 470 475 480

Tyr Arg Glu Thr Phe Arg Ala Ser Thr Gln Ala Arg Gly Phe Phe Lys
485 490 495

Arg Gln Ser Gly Gly Lys Gly Gln Phe Gly Asp Val Trp Ile Glu Phe
500 505 510

Thr Pro Asn Glu Glu Gly Lys Gly Phe Glu Phe Glu Asn Ala Ile Val
515 520 525

Gly Gly Val Val Pro Arg Glu Phe Ile Pro Ala Val Glu Lys Gly Leu
530 535 540



Val Glu Ser Met Ala Asn Gly Val Leu Ala Gly Tyr Pro Met Val Asp
545 550 555 560

Val Lys Ala Lys Leu Tyr Asp Gly Ser Tyr His Asp Val Asp Ser Ser
565 570 575

Glu Thr Ala Phe Lys Ile Ala Ala Ser Leu Ser Leu Lys Glu Ala Ala
580 585 590

Lys Ser Ala Gln Pro Ala Ile Leu Glu Pro Met Met Leu Val Thr Ile
595 600 605

Thr Val Pro Glu Glu Asn Leu Gly Asp Val Met Gly His Val Thr Ala
610 615 620

Arg Arg Gly Arg Val Asp Gly Met Glu Ala His Gly Asn Ser Gln Ile
625 630 635 640

Val Arg Ala Tyr Val Pro Leu Ala Glu Met Phe Gly Tyr Ala Thr Val
645 650 655

Leu Arg Ser Ala Ser Gln Gly Arg Gly Thr Phe Met Met Val Phe Asp
660 665 670

His Tyr Glu Asp Val Pro Lys Ser Val Gln Glu Glu Ile Ile Lys Lys
675 680 685

Asn Lys Gly Glu Asp
690

(2) INFORMATION FOR SEQ ID NO: 223: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 274 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 223: 

Ala Tyr Lys Gly His Gln Glu Tyr Val Leu Pro Gln Ala Ala Arg Lys
1 5 10 15

Ile Tyr Ala Tyr Arg Arg Tyr Asp Leu Asn Glu Ser Pro Lys Thr Ala
20 25 30

Leu Asp Leu Ile Ile Pro Asp Leu Phe Leu His Ile Leu Asn Pro Ala
35 40 45

Glu Arg Glu Arg Lys Leu Lys Arg Glu Gly Val Glu Glu Leu Tyr Leu
50 55 60

Leu Asp Phe Ser Ser Gln Phe Ala Ser Leu Thr Ala Gln Glu Phe Phe
65 70 75 80

Ala Thr Tyr Ile Lys Ala Met Asn Ala Lys Ile Ile Val Ala Gly Phe
85 90 95

Asp Tyr Thr Phe Gly Ser Asp Lys Lys Thr Ala Glu Asp Leu Lys Asp
100 105 110

Tyr Phe Asp Gly Glu Val Ile Ile Val Pro Pro Val Glu Asp Glu Lys
115 120 125

Gly Lys Ile Ser Ser Thr Arg Ile Arg Gln Ala Ile Leu Asp Gly Asn
130 135 140

Val Lys Glu Ala Gly Lys Leu Leu Gly Ala Pro Leu Pro Ser Arg Gly
145 150 155 160

Met Val Val His Gly Asn Ala Arg Gly Arg Thr Ile Gly Tyr Pro Thr
165 170 175

Ala Asn Leu Val Leu Leu Asp Arg Thr Tyr Met Pro Ala Asp Gly Val
180 185 190

Tyr Val Val Asp Val Glu Ile Gln Arg Gln Lys Tyr Arg Ala Met Ala
195 200 205

Ser Val Gly Lys Asn Val Thr Phe Asp Gly Glu Glu Ala Arg Phe Glu
210 215 220

Val Asn Ile Phe Asp Phe Asn Gln Asp Ile Tyr Gly Glu Thr Val Met
225 230 235 240

Val Tyr Trp Leu Asp Arg Ile Arg Asp Met Thr Lys Phe Asp Ser Val
245 250 255

Asp Gln Leu Val Asp Gln Leu Lys Ala Asp Glu Glu Val Thr Arg Asn
260 265 270

Trp Ser 



(2) INFORMATION FOR SEQ ID NO:224: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 124 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant 
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:224: 

Leu Arg Lys Glu Pro Ser Met Ala Lys Gly Glu Gly Lys Val Val Ala
1 5 10 15

Gln Asn Lys Lys Ala Arg His Asp Tyr Thr Ile Val Asp Thr Leu Glu
20 25 30

Ala Gly Met Val Leu Thr Gly Thr Glu Ile Lys Ser Val Arg Ala Ala
35 40 45

Arg Ile Asn Leu Lys Asp Gly Phe Ala Gln Val Lys Asn Gly Glu Val
50 55 60

Trp Leu Ser Asn Val His Ile Ala Pro Tyr Glu Glu Gly Asn Ile Trp
65 70 75 80

Asn Gln Glu Pro Glu Arg Arg Arg Lys Leu Leu Leu His Lys Lys Gln
85 90 95

Ile Gln Lys Leu Glu Gln Glu Thr Lys Gly Thr Gly Met Thr Leu Val
100 105 110

Pro Leu Lys Val Tyr Met Ala Thr Leu Ser Phe Phe 



(2) INFORMATION FOR SEQ ID NO:225: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 441 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 



(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:225: 

Ile Val Lys Glu Glu Lys Gly Leu Lys Glu Lys Gln Phe Trp Asn Arg
1 5 10 15

Ile Leu Glu Phe Ala Gln Glu Arg Leu Thr Arg Ser Met Tyr Asp Phe
20 25 30

Tyr Ala Ile Gln Ala Glu Leu Ile Lys Val Glu Glu Asn Val Ala Thr
35 40 45

Ile Phe Leu Pro Arg Ser Glu Met Glu Met Val Trp Glu Lys Gln Leu
50 55 60

Lys Asp Ile Ile Val Val Ala Gly Phe Glu Ile Tyr Asp Ala Glu Ile
65 70 75 80

Thr Pro His Tyr Ile Phe Thr Lys Pro Gln Asp Thr Thr Ser Ser Gln
85 90 95

Val Glu Glu Ala Thr Asn Leu Thr Leu Tyr Asp Tyr Ser Pro Lys Leu
100 105 110

Val Ser Ile Pro Tyr Ser Asp Thr Gly Leu Lys Glu Lys Tyr Thr Phe
115 120 125

Asp Asn Phe Ile Gln Gly Asp Gly Asn Val Trp Ala Val Ser Ala Ala
130 135 140

Leu Ala Val Ser Glu Asp Leu Ala Leu Thr Tyr Asn Pro Leu Phe Ile
145 150 155 160

Tyr Gly Gly Pro Gly Leu Gly Lys Thr His Leu Leu Asn Ala Ile Gly
165 170 175

Asn Glu Ile Leu Lys Asn Ile Pro Asn Ala Arg Val Lys Tyr Ile Pro
180 185 190

Ala Glu Ser Phe Ile Asn Asp Phe Leu Asp His Leu Arg Leu Gly Glu
195 200 205

Met Glu Lys Phe Lys Lys Thr Tyr Arg Ser Leu Asp Leu Leu Leu Ile
210 215 220

Asp Asp Ile Gln Ser Leu Ser Gly Lys Lys Val Ala Thr Gln Glu Glu
225 230 235 240

Phe Phe Asn Thr Phe Asn Ala Leu His Asp Lys Gln Lys Gln Ile Val
245 250 255

Leu Thr Ser Asp Arg Ser Pro Lys His Leu Glu Gly Leu Glu Glu Arg
260 265 270

Leu Val Thr Arg Phe Ser Trp Gly Leu Thr Gln Thr Ile Thr Pro Pro
275 280 285

Asp Phe Glu Thr Arg Ile Ala Ile Leu Gln Ser Lys Thr Glu His Leu
290 295 300

Gly Tyr Asn Phe Gln Ser Asp Thr Leu Glu Tyr Leu Ala Gly Gln Phe
305 310 315 320

Asp Ser Asn Val Arg Asp Leu Glu Gly Ala Ile Asn Asp Ile Thr Leu
325 330 335

Ile Ala Arg Val Lys Lys Ile Lys Asp Ile Thr Ile Asp Ile Ala Ala
340 345 350

Glu Ala Ile Arg Ala Arg Lys Gln Asp Val Ser Gln Met Leu Val Ile
355 360 365

Pro Ile Asp Lys Ile Gln Thr Glu Val Gly Asn Phe Tyr Gly Val Ser
370 375 380

Ile Lys Glu Met Lys Gly Ser Arg Arg Leu Gln Asn Ile Val Leu Ala
385 390 395 400

Arg Gln Val Ala Met Tyr Leu Ser Arg Glu Leu Thr Asp Asn Ser Leu
405 410 415

Pro Lys Ile Gly Lys Glu Leu Gly Glu Lys Ser Tyr His Ser His Ser
420 425 430

Cys Pro Cys Gln Asn Lys Ile Leu Asn
435 440

(2) INFORMATION FOR SEQ ID NO: 226: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 201 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant
(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:226: 

Glu Leu Val Ser Thr Met Tyr Phe Arg Phe Asp Tyr Tyr Ser Gln Asn
1 5 10 15

Leu Gly Glu Ile Phe Ala Ile Gly Met Val Val Gly His Leu Arg Trp
20 25 30

Leu Ile Thr Gly Ala Leu Val Leu Tyr Ile Phe Ala Asp Arg Lys Leu
35 40 45

Ile Asn Thr Trp Asp Phe Leu Asp Ile Ala Ala Pro Ser Val Met Ile
50 55 60

Ala Gln Ser Leu Gly Arg Trp Gly Asn Phe Phe Asn Gln Glu Ala Tyr
65 70 75 80

Gly Ala Thr Val Asp Asn Leu Asp Tyr Leu Pro Gly Phe Ile Arg Asp
85 90 95

Gln Met Tyr Ile Glu Gly Ser Tyr Arg Gln Pro Thr Phe Leu Tyr Glu
100 105 110

Ser Leu Trp Asn Leu Leu Gly Phe Ala Leu Ile Leu Ile Phe Arg Arg
115 120 125

Lys Trp Lys Ser Leu Arg Arg Gly His Ile Thr Ala Phe Tyr Leu Ile
130 135 140

Trp Tyr Gly Phe Gly Arg Met Val Ile Glu Gly Met Arg Thr Asp Ser
145 150 155 160

Leu Met Phe Phe Gly Leu Arg Val Ser Gln Trp Leu Ser Val Val Leu
165 170 175

Ile Gly Leu Gly Ile Met Ile Val Ile Tyr Gln Asn Arg Lys Lys Ala
180 185 190

Pro Tyr Tyr Ile Thr Glu Glu Glu Asn
195 200

(2) INFORMATION FOR SEQ ID NO: 227:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 491 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 227: 

Leu Glu Asp Phe Pro Leu Ser Val Thr Asn Pro Tyr Gly Arg Thr Lys
1 5 10 15

Leu Met Leu Glu Glu Ile Leu Thr Asp Ile Tyr Lys Ala Asp Ser Glu
20 25 30

Trp Asn Val Val Leu Leu Arg Tyr Phe Asn Pro Ile Gly Val His Glu
35 40 45

Ser Gly Asp Leu Gly Glu Asn Pro Asn Gly Ile Pro Asn Asn Leu Leu
50 55 60

Pro Tyr Val Thr Gln Val Ala Val Gly Lys Leu Glu Gln Val Gln Val
65 70 75 80

Phe Gly Asp Asp Tyr Asp Thr Glu Asp Gly Thr Gly Val Arg Asp Tyr
85 90 95

Ile His Val Val Asp Leu Ala Lys Gly His Val Ala Ala Leu Lys Lys
100 105 110

Ile Gln Lys Gly Ser Gly Leu Asn Val Tyr Asn Leu Gly Thr Gly Lys
115 120 125

Gly Tyr Ser Val Leu Glu Ile Ile Gln Asn Met Glu Lys Ala Val Gly
130 135 140

Cys Pro Ile Pro Tyr Arg Ile Val Glu Arg Arg Pro Gly Asp Ile Ala
145 150 155 160

Ala Cys Tyr Ser Asp Pro Ala Lys Ala Lys Ala Glu Leu Gly Trp Glu
165 170 175

Ala Glu Leu Asp Ile Thr Gln Met Cys Glu Gly His Gly Val Gly Arg
180 185 190

Ala Ser Ile Gln Met Asp Leu Lys Thr Lys Met Met Ile Ser Ile Ile
195 200 205

Val Pro Cys Leu Asn Glu Glu Glu Val Leu Pro Leu Phe Tyr Gln Ala
210 215 220

Leu Glu Ala Leu Leu Pro Asp Leu Glu Thr Glu Ile Glu Tyr Val Phe
225 230 235 240

Val Asp Asp Gly Ser Ser Asp Gly Thr Leu Glu Leu Leu Lys Ala Tyr
245 250 255

Arg Glu Gln Asn Pro Ala Val His Tyr Ile Ser Phe Ser Arg Asn Phe
260 265 270

Gly Lys Glu Ala Ala Leu Tyr Ala Gly Leu Gln Tyr Ala Thr Gly Asp
275 280 285

Leu Val Val Val Met Asp Ala Asp Leu Gln Asp Pro Pro Ser Met Leu
290 295 300

Phe Glu Met Lys Asn Val Leu Asp Lys Asn Val Asp Leu Asp Cys Val
305 310 315 320

Gly Thr Arg Arg Thr Ser Arg Glu Gly Glu Pro Phe Phe Arg Ser Phe
325 330 335

Cys Ala Val Leu Phe Tyr Arg Leu Met Gln Lys Ile Ser Pro Val Ala
340 345 350



Leu Pro Ser Gly Val Arg Asp Phe Arg Met Met Arg Arg Ser Val Val
355 360 365

Asp Ala Ile Leu Ser Leu Thr Glu Ser Asn Arg Phe Ser Lys Gly Leu
370 375 380

Phe Ala Trp Val Gly Phe Lys Thr His Tyr Leu Asp Tyr Pro Asn Val
385 390 395 400






Glu Arg Gln Ala Gly Lys Thr Ser Trp Ser Phe Arg Gln Leu Phe Phe
405 410 415

Tyr Ser Ile Glu Gly Ile Val Asn Phe Ser Asp Phe Pro Leu Thr Ile
420 425 430

Ala Phe Val Ala Gly Leu Leu Ser Cys Phe Leu Ser Leu Leu Met Thr
435 440 445

Phe Phe Val Val Val Arg Thr Leu Ile Leu Gly Asn Pro Thr Ser Gly
450 455 460

Trp Thr Ser Leu Met Ala Val Ile Leu Tyr Leu Gly Gly Ile Gln Leu
465 470 475 480

Leu Thr Ile Gly Ile Leu Gly Lys Tyr Asn Gln
485 490

(2) INFORMATION FOR SEQ ID NO: 228:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 277 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: not relevant 



(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:228: 

Val Ile Ile Ile Asp Asp Asn Tyr Ser Asn Val Asn Leu Arg Asn Lys
1 5 10 15

Ile Ile His Gln Phe Gly Tyr Thr Asn His Arg Ile Lys Leu Ile Leu
20 25 30

Ser Asn Glu Asp Leu Gly Ala Thr Asn Ala Arg Asn Ile Gly Ile Lys
35 40 45

Asn Ser Arg Gly Lys Tyr Ile Ser Phe Leu Asp Asp Asp Asp Glu Tyr
50 55 60

Met Pro Asp Arg Ile Leu Lys Leu Met Ala Cys Phe Lys Lys Ser Arg
65 70 75 80

Met Lys Asn Leu Ala Leu Val Tyr Ser Tyr Gly Ile Ile Ile Tyr Pro
85 90 95

Asn Gly Thr Arg Glu Glu Glu Lys Thr Asp Phe Val Gly Asn Pro Leu
100 105 110

Phe Val Gln Met Val His Asn Ile Ala Gly Thr Ser Phe Trp Leu Cys
115 120 125

Lys Lys Glu Val Leu Glu Leu Ile Asn Gly Phe Glu Lys Ile Asp Ser
130 135 140

His Gln Asp Gly Val Val Leu Leu Lys Leu Leu Ala Gln Gly Tyr Gln
145 150 155 160

Ile Asp Ile Val Arg Glu Phe Leu Val Asn Tyr Tyr Ala His Ser Lys
165 170 175

Glu Asn Gly Ile Thr Gly Val Thr Gln Lys Thr Ile Asn Ala Asp Glu
180 185 190

Glu Tyr Tyr Asn Tyr Cys Arg Lys Tyr Phe Asn Leu Leu Ser Phe Asn
195 200 205

Glu Arg Ile Leu Val Thr Lys Lys Tyr Tyr Ser Leu Asn Ile Lys Arg
210 215 220

Leu Leu Leu Ile Gly Asp Lys Cys Lys Ala Leu Lys Val Ile Lys Lys
225 230 235 240

Ala Arg Glu Glu Lys Ile Phe Asn Glu Phe Leu Phe Leu Lys Tyr Met
245 250 255

Leu Leu Tyr Arg Ser Phe Phe Tyr Cys Ile Tyr Asp Asn Tyr Val Gln
260 265 270

Leu Lys Phe Arg Lys
275

CLAIMS



1. An isolated nucleic acid compound comprising a
sequence identical to or substantially identical to a
sequence selected from the group consisting of SEQ ID NO:1
through SEQ ID NO:86.

2. An isolated nucleic acid compound comprising a
sequence identical to or substantially identical to a
sequence selected from the group consisting of SEQ ID NO:87,
SEQ ID NO:89, SEQ ID NO:91, SEQ ID NO:93, SEQ ID NO:95, SEQ
ID NO:97, SEQ ID NO:99, SEQ ID NO:101, SEQ ID NO:103, SEQ ID
NO:105, SEQ ID NO:107, SEQ ID NO:109, SEQ ID NO:111, SEQ ID
NO:113, SEQ ID NO:115, SEQ ID NO:117, SEQ ID NO:119, and SEQ
ID NO:121.

3. A substantially purified protein or fragment
thereof from S. pneumoniae wherein said protein is selected
from the group consisting of SEQ ID NO:88, SEQ ID NO:90, SEQ
ID NO:92, SEQ ID NO:94, SEQ ID NO:96, SEQ ID NO:98, SEQ ID
NO:100, SEQ ID NO:102, SEQ ID NO:104, SEQ ID NO:106, SEQ ID
NO:108, SEQ ID NO:110, SEQ ID NO:112, SEQ ID NO:114, SEQ ID
NO:116, SEQ ID NO:118, SEQ ID NO:120, SEQ ID NO:122, and SEQ
ID NO:123 through SEQ ID NO:228.

4. An isolated nucleic acid compound encoding any 
one of the proteins or fragments thereof of Claim 3. 

5. A vector comprising any one of the nucleic acid
compounds of claims 1, 2, or 4.



6. A recombinant host containing any one of the
vectors of claim 5.

7. A substantially purified protein from
Streptococcus pneumoniae as in Claim 3 wherein said protein 
is an external target protein selected from Table 1. 

8. A substantially purified protein from 
Streptococcus pneumoniae as in Claim 3 wherein said protein 
is a hypothetical protein selected from Table 1. 

9. A substantially purified protein from 
Streptococcus pneumoniae as in Claim 3 wherein said protein 
is a cell wall synthetic protein selected from Table 1. 

10. A substantially purified protein from 
Streptococcus pneumoniae as in Claim 3 wherein said protein 
is a minimal gene set protein selected from Table 1. 

11. A DNA chip having arrayed thereon any at least 
15 base pair fragment of any one or more of the nucleic 
acids of claim 1. 

12. A DNA chip having arrayed thereon any at least 
15 base pair fragment of any one or more of the nucleic 
acids of claim 2. 

13. A method for evaluating gene expression in 
Streptococcus pneumoniae comprising the step of incubating a 
DNA chip of claim 11 or Claim 12 with cDNA prepared from 
Streptococcus pneumoniae under conditions suitable for 
hybridization of complementary nucleic acid sequences. 

14. A computer readable medium having recorded
thereon any one or more of the nucleotide sequences of
Claim 1 or Claim 2.

15. A method for identifying virulence genes in S.
pneumoniae, comprising the steps of: 

a) preparing a DNA chip as in claim 11, 

b) preparing labeled cDNAs from 

i) S. pneumoniae cells recovered from an in
vivo environment, and

ii) S. pneumoniae cells grown in vitro,

c) hybridizing individually the cDNAs of steps
(b)(i) and (b)(ii) to a chip of step (a); and

d) identifying a genomic DNA fragment or fragments
on said chip that hybridize to the cDNAs of (b)(i) but not
with the cDNAs of (b)(ii).

16. An antibody that selectively binds to a
protein or peptide of Claim 3.

17. An antibody that selectively binds to an
external target protein, or fragment thereof, identified in
Table 1.

18. A DNA chip of Claim 11 or Claim 12 further
comprising a layer of S. pneumoniae cells wherein said layer
contacts with said nucleic acids.